Adaptive filter scheduling in the Parquet decoder (replaces PR #9) #11

Open
adriangb wants to merge 14 commits into main from adaptive-filters-in-decoder

adriangb (Owner) commented Apr 27, 2026

Summary

Replaces PR #9's morsel-per-row-group split with an in-decoder strategy swap: one ParquetPushDecoder per file, one BoxStream per file, and filter placement re-evaluated at every row-group boundary using the shared SelectivityTracker.

Filters can now adapt mid-stream (between row groups) without splitting files into chunks. The arrow-rs companion change adds a small ParquetPushDecoder::swap_strategy API; the DataFusion side uses it from a single adaptive stream wrapper.

arrow-rs companion branch

This PR depends on pydantic/arrow-rs branch adaptive-strategy-swap (CI green at pydantic/arrow-rs#9), referenced via [patch.crates-io] in the workspace Cargo.toml.

The arrow-rs additions:

  • pub fn can_swap_strategy(&self) -> bool — true between row groups.
  • pub fn swap_strategy(&mut self, swap: StrategySwap) -> Result<()> — replaces projection / row filter / row selection policy at a row-group boundary; rejected mid-row-group.
  • pub struct StrategySwap (#[non_exhaustive]) with builder methods.
  • pub fn row_groups_remaining(&self) -> usize for diagnostics.

PushBuffers carries through the swap, so bytes already fetched for columns that survive the new strategy are reused.
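
The boundary-only contract described above can be illustrated with a small mock. This is a sketch under stated assumptions, not the arrow-rs implementation: `MockDecoder`, its fields, `begin_row_group` / `finish_row_group`, and the builder method names `with_projection` / `with_row_filter` are hypothetical stand-ins, and predicates are modelled as plain strings. Only the three method names `can_swap_strategy`, `swap_strategy`, and `row_groups_remaining` mirror the list above.

```rust
// Mock only: none of these types are the arrow-rs implementation.

/// Hypothetical stand-in for the `StrategySwap` builder.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct StrategySwap {
    projection: Option<Vec<usize>>,
    row_filter: Option<Vec<String>>,
}

impl StrategySwap {
    pub fn new() -> Self {
        Self::default()
    }
    /// Assumed builder name: replace the projected column indices.
    pub fn with_projection(mut self, columns: Vec<usize>) -> Self {
        self.projection = Some(columns);
        self
    }
    /// Assumed builder name: replace the row-level predicates.
    pub fn with_row_filter(mut self, predicates: Vec<String>) -> Self {
        self.row_filter = Some(predicates);
        self
    }
}

/// Hypothetical decoder that only tracks whether it is inside a row group.
pub struct MockDecoder {
    mid_row_group: bool,
    row_groups_left: usize,
    pub projection: Vec<usize>,
    pub row_filter: Vec<String>,
}

impl MockDecoder {
    pub fn new(row_groups: usize, projection: Vec<usize>) -> Self {
        Self { mid_row_group: false, row_groups_left: row_groups, projection, row_filter: Vec::new() }
    }
    /// True only between row groups.
    pub fn can_swap_strategy(&self) -> bool {
        !self.mid_row_group
    }
    /// Applies the parts of `swap` that were set; rejected mid-row-group.
    pub fn swap_strategy(&mut self, swap: StrategySwap) -> Result<(), String> {
        if !self.can_swap_strategy() {
            return Err("cannot swap strategy mid-row-group".to_string());
        }
        if let Some(projection) = swap.projection {
            self.projection = projection;
        }
        if let Some(row_filter) = swap.row_filter {
            self.row_filter = row_filter;
        }
        Ok(())
    }
    pub fn row_groups_remaining(&self) -> usize {
        self.row_groups_left
    }
    /// Starts decoding the next row group; returns false if none is available.
    pub fn begin_row_group(&mut self) -> bool {
        if self.mid_row_group || self.row_groups_left == 0 {
            return false;
        }
        self.mid_row_group = true;
        self.row_groups_left -= 1;
        true
    }
    pub fn finish_row_group(&mut self) {
        self.mid_row_group = false;
    }
}
```

In this mock a rejected swap leaves the previous projection and filter untouched, which is the behaviour a caller would presumably want from a mid-row-group rejection.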

What's removed (vs PR #9)

  • The chunk loop (ParquetAccessPlan::split_into_chunks, Vec<BoxStream> returns from build_stream).
  • Per-chunk AsyncFileReader::create_reader minting and per-chunk RowFilter rebuild (RowFilter is !Clone).
  • The EarlyStoppingStream-on-chunk-0-only special case for the non-Clone FilePruner.
  • LazyMorselShared per-morsel Arc churn, the source of the ~10% aggregate ClickBench regression flagged in the review of PR #9 (Adaptive filter scheduling + row-group morsel split).

What's added

AdaptiveParquetStream in opener.rs drives one row group at a time via try_next_reader:

  1. Pull a ParquetRecordBatchReader for the next row group.
  2. Iterate synchronously; each batch goes through any post-scan filters (which feed per-filter stats into the tracker) and then through the projector.
  3. When the reader exhausts, ask the tracker to re-partition filters based on accumulated stats. If the row-filter set changed, build a new RowFilter and call decoder.swap_strategy(...) before requesting the next reader. Post-scan filters update in lockstep.

PushBuffers carries through the swap so already-fetched bytes are preserved. The optional-filter mid-stream skip mechanism (existing OptionalFilterPhysicalExpr + tracker.is_filter_skipped) keeps working unchanged inside apply_post_scan_filters_with_stats.
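
The three steps above can be sketched as a toy loop over in-memory row groups. Everything here is an assumption made for illustration: `Filter`, `MockTracker`, `scan`, and the pass-rate threshold are hypothetical, and the real tracker decides placement from richer statistics than a pass rate. Filter ids are assumed to be `0..filters.len()` so they can index the stats vector directly.

```rust
use std::collections::BTreeSet;

type FilterId = usize;

/// Hypothetical filter: an id plus a predicate over a single i64 value.
pub struct Filter {
    pub id: FilterId,
    pub pred: fn(i64) -> bool,
}

#[derive(Clone, Copy, Default)]
struct Stats {
    seen: u64,
    passed: u64,
}

/// Hypothetical tracker: promotes a filter to row level once its observed
/// pass rate is at or below `max_pass_rate`.
struct MockTracker {
    stats: Vec<Stats>,
    max_pass_rate: f64,
}

impl MockTracker {
    fn record(&mut self, id: FilterId, passed: bool) {
        self.stats[id].seen += 1;
        if passed {
            self.stats[id].passed += 1;
        }
    }

    fn row_level(&self) -> BTreeSet<FilterId> {
        self.stats
            .iter()
            .enumerate()
            .filter(|(_, s)| {
                if s.seen == 0 {
                    return false;
                }
                let rate = s.passed as f64 / s.seen as f64;
                rate <= self.max_pass_rate
            })
            .map(|(id, _)| id)
            .collect()
    }
}

/// Returns the surviving rows and how many times the strategy was swapped.
pub fn scan(row_groups: &[Vec<i64>], filters: &[Filter], max_pass_rate: f64) -> (Vec<i64>, usize) {
    let mut tracker = MockTracker { stats: vec![Stats::default(); filters.len()], max_pass_rate };
    let mut row_level: BTreeSet<FilterId> = BTreeSet::new(); // every filter starts post-scan
    let mut out = Vec::new();
    let mut swaps = 0;
    for rows in row_groups {
        // Step 1: "pull a reader" for this row group with the current placement.
        let (first, rest): (Vec<&Filter>, Vec<&Filter>) =
            filters.iter().partition(|f| row_level.contains(&f.id));
        // Step 2: row-level filters run first, then post-scan filters; both feed stats.
        for &v in rows {
            let mut keep = true;
            for f in first.iter().chain(rest.iter()) {
                let passed = (f.pred)(v);
                tracker.record(f.id, passed);
                if !passed {
                    keep = false;
                    break;
                }
            }
            if keep {
                out.push(v);
            }
        }
        // Step 3: re-partition at the boundary; swap only if the row-level set changed.
        let next = tracker.row_level();
        if next != row_level {
            row_level = next;
            swaps += 1;
        }
    }
    (out, swaps)
}
```

The point of the sketch is the shape of the control flow: placement is fixed while a row group is being read and is only reconsidered once the reader is exhausted.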

Carried-over machinery (file-level checkout from dbcf5ac1e)

  • selectivity.rs: SelectivityTracker, PartitionedFilters, FilterId, Welford CI bounds.
  • row_filter.rs: new build_row_filter signature returning (Option<RowFilter>, UnbuildableFilters) plus total_compressed_bytes, plus DatafusionArrowPredicate stat hooks.
  • physical_expr.rs: OptionalFilterPhysicalExpr, snapshot_generation helpers. Display is pass-through here (PR #9 used Optional(...)), keeping every existing sqllogictest expected output intact.
  • config.rs: adds filter_pushdown_min_bytes_per_sec / filter_collecting_byte_ratio_threshold / filter_confidence_z. reorder_filters is preserved as a deprecated no-op because the adaptive tracker subsumes it.
  • Per-file plumbing in source.rs: predicate_conjuncts: Vec<(FilterId, Arc<dyn PhysicalExpr>)> instead of a single AND-ed predicate, so per-conjunct stats accumulate across files.
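
For readers unfamiliar with the "Welford CI bounds" mentioned above, here is a minimal sketch of Welford's online mean/variance update with a normal-approximation confidence interval. The `Welford` struct and its method names are hypothetical and are not taken from selectivity.rs; treating `z` as the value that `filter_confidence_z` configures is likewise an assumption.

```rust
/// Running mean and variance using Welford's online algorithm.
#[derive(Debug, Default, Clone, Copy)]
pub struct Welford {
    n: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the running mean
}

impl Welford {
    /// Folds one observation into the running statistics.
    pub fn update(&mut self, x: f64) {
        self.n += 1;
        let delta = x - self.mean;
        self.mean += delta / self.n as f64;
        self.m2 += delta * (x - self.mean);
    }

    pub fn mean(&self) -> f64 {
        self.mean
    }

    /// Unbiased sample variance; needs at least two observations.
    pub fn sample_variance(&self) -> Option<f64> {
        if self.n < 2 {
            None
        } else {
            Some(self.m2 / (self.n - 1) as f64)
        }
    }

    /// (lower, upper) bounds: mean ± z * sqrt(variance / n).
    pub fn confidence_bounds(&self, z: f64) -> Option<(f64, f64)> {
        let variance = self.sample_variance()?;
        let half_width = z * (variance / self.n as f64).sqrt();
        Some((self.mean - half_width, self.mean + half_width))
    }
}
```

A scheduler could, for example, act on a filter only once the whole interval sits on one side of a threshold, so a few noisy batches do not flip its placement back and forth.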

Deferred

  • Sub-row-group adaptation (would need ParquetRecordBatchReader::pause in arrow-rs to yield a residual RowSelection). Useful for TPC-DS-style single-huge-row-group files. Out of scope here.
  • The three new config knobs aren't in the proto schema yet; from_proto fills them with config defaults so a roundtrip preserves behavior. Worth a follow-up to plumb them through the proto.

Known pre-existing CI flake

datafusion/sqllogictest/test_files/explain_analyze.slt:103 (the output_rows_skew metric test) is failing on apache/datafusion main itself; see run 25027102370 on commit 310dd5d4, which shows the identical diff (expected 84.31%, actual 100%). Not introduced by this branch. Fixing it is out of scope; this PR matches the pre-existing CI baseline.

Test plan

  • cargo fmt --all
  • cargo clippy --workspace --all-targets ... -- -D warnings
  • cargo test -p datafusion-datasource-parquet --lib — 143 passed
  • cargo test -p datafusion --lib — 410 passed
  • cargo test -p datafusion --test core_integration — 935 passed
  • cargo doc --workspace --no-deps — clean
  • cargo run --example data_io -- json_shredding passes with pushdown_rows_pruned=1
  • cargo test -p datafusion-sqllogictest --test sqllogictests — pass except the pre-existing explain_analyze.slt and encrypted_parquet.slt flakes that fail on apache/main
  • CI (in progress — currently re-running after the doc/configs.md/json_shredding fixes pushed in 608f280c4/32195d6da/4a4e300eb)
  • ClickBench (pre/post): aggregate + per-query. Expected: ~10% regression from LazyMorselShared churn disappears
  • Hash-join dynamic filter blog-post benchmark (small_table JOIN large_table with WHERE small_table.v >= 50): unchanged from main

🤖 Generated with Claude Code

adriangb (Owner, Author) commented

run benchmark clickbench_partitioned

baseline:
    ref: main
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: false
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: false
changed:
    ref: HEAD
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: true
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: true

adriangb and others added 3 commits April 27, 2026 23:49
Replaces PR #9's morsel-per-row-group split with in-decoder strategy
swap. One `ParquetPushDecoder` per file, one `BoxStream` per file,
filter placement re-evaluated at every row-group boundary using the
shared `SelectivityTracker`.

# What's removed (vs PR #9)

- The chunk loop (`ParquetAccessPlan::split_into_chunks`,
  `Vec<BoxStream>` returns from `build_stream`).
- Per-chunk `AsyncFileReader::create_reader` minting and per-chunk
  `RowFilter` rebuild.
- The `EarlyStoppingStream`-on-chunk-0-only special case for the
  non-`Clone` `FilePruner`.
- `LazyMorselShared` per-morsel Arc churn — the source of the ~10%
  aggregate ClickBench regression you flagged in PR #9 review.

# What's added

`AdaptiveParquetStream` (new in `opener.rs`) drives one row group at a
time via `try_next_reader`:

1. Pull a `ParquetRecordBatchReader` for the next row group.
2. Iterate it synchronously; each batch goes through any post-scan
   filters (which feed per-filter stats into the tracker) and then
   through the projector.
3. When the reader exhausts, ask the tracker to re-partition filters
   based on accumulated stats. If the row-filter set changed, build
   a new `RowFilter` and call the new arrow-rs
   `ParquetPushDecoder::swap_strategy` before requesting the next
   reader. Post-scan filters update in lockstep.

`PushBuffers` carries through the swap so already-fetched bytes are
preserved, and the optional-filter mid-stream skip mechanism (existing
`OptionalFilterPhysicalExpr` + `tracker.is_filter_skipped`) keeps
working unchanged inside `apply_post_scan_filters_with_stats`.

# Carried-over machinery (file-level checkout from `dbcf5ac1e`)

- `selectivity.rs` — `SelectivityTracker`, `PartitionedFilters`,
  `FilterId`, Welford CI bounds. Verbatim.
- `row_filter.rs` — new `build_row_filter` signature returning
  `(Option<RowFilter>, UnbuildableFilters)` plus
  `total_compressed_bytes`, plus `DatafusionArrowPredicate` stat hooks.
- `physical_expr.rs` — `OptionalFilterPhysicalExpr`, `snapshot_generation`
  helpers. `Display` is **pass-through** here (PR #9 used
  `Optional(...)`), keeping every existing sqllogictest expected output
  intact.
- `config.rs` — adds `filter_pushdown_min_bytes_per_sec` /
  `filter_collecting_byte_ratio_threshold` / `filter_confidence_z`.
  **`reorder_filters` is preserved as a deprecated no-op** (per
  request) — the adaptive tracker subsumes it.
- `selectivity_tracker.rs` bench — verbatim.
- Per-file plumbing in `source.rs`: `predicate_conjuncts:
  Vec<(FilterId, Arc<PhysicalExpr>)>` instead of a single AND-ed
  predicate so per-conjunct stats accumulate across files.

# arrow-rs companion branch

Depends on `pydantic/arrow-rs:adaptive-strategy-swap`, which adds
`ParquetPushDecoder::can_swap_strategy()` /
`swap_strategy(StrategySwap)` and the `StrategySwap` builder. The
`Cargo.toml` `[patch.crates-io]` block points at it.

# What's not in this PR (deferred)

- Sub-row-group adaptation (would need a `ParquetRecordBatchReader::pause`
  primitive in arrow-rs to yield a residual `RowSelection`); useful for
  TPCDS-style single-huge-row-group files. Defer.
- Three new config knobs aren't in the proto schema yet; `from_proto`
  fills with config defaults so a roundtrip preserves behavior.

# Tests

- `cargo test -p datafusion-datasource-parquet --lib` — 143 passed
- `cargo test -p datafusion --lib` — 410 passed
- `cargo test -p datafusion --test core_integration` — 935 passed
- `cargo test -p datafusion-sqllogictest --test sqllogictests` — all
  pass except `encrypted_parquet.slt` (pre-existing on upstream/main,
  not related to this change)

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- Fix 6 broken intra-doc links in `opener.rs`: `RowFilter`,
  `PushBuffers`, `AsyncFileReader::create_reader`, `SelectivityTracker`
  weren't visible from the doc-comment scope. Reword to plain backticks
  for the names that don't have a stable in-scope path; route
  `SelectivityTracker` through `crate::selectivity::SelectivityTracker`.
- Regenerate `docs/source/user-guide/configs.md` via
  `dev/update_config_docs.sh` to surface the three new
  `filter_pushdown_min_bytes_per_sec` /
  `filter_collecting_byte_ratio_threshold` / `filter_confidence_z`
  rows the CI doc check expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…33dd62

Picks up the rustdoc fix from the arrow-rs companion branch so the
DataFusion CI doc job resolves clean too.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
adriangb force-pushed the adaptive-filters-in-decoder branch from 4a4e300 to d379196 on April 28, 2026 04:49
The example asserts `pushdown_rows_pruned=1` to demonstrate that the
row-filter path actually evicts rows. Under the adaptive scheduler's
default `filter_pushdown_min_bytes_per_sec = 100 MB/s`, a small
example file's filter starts on the post-scan path (where
`pushdown_rows_pruned` stays 0) and the assertion fires.

Set `filter_pushdown_min_bytes_per_sec = 0` to disable the throughput
check and force every filter to row-level, the same lever the
`physical_plan/parquet.rs` test harness uses.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
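
The effect of the threshold described in this commit message can be sketched as a small decision function. This is an illustration under assumptions, not the scheduler's actual rule: `Placement` and `place_filter` are hypothetical names, and "bytes per second" is assumed here to mean bytes of decoding avoided per second of predicate evaluation, which the text above does not spell out.

```rust
/// Where a filter runs: inside the decoder (row level) or after the scan.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Placement {
    RowLevel,
    PostScan,
}

/// Hypothetical placement rule.
/// `bytes_saved`: bytes of decoding the filter avoided so far (assumed metric).
/// `eval_secs`: time spent evaluating the filter so far.
/// `min_bytes_per_sec`: the configured threshold; 0 disables the check.
pub fn place_filter(bytes_saved: u64, eval_secs: f64, min_bytes_per_sec: f64) -> Placement {
    // A zero threshold disables the throughput check: everything goes row level.
    if min_bytes_per_sec <= 0.0 {
        return Placement::RowLevel;
    }
    // No measurements yet: start on the post-scan path.
    if eval_secs <= 0.0 {
        return Placement::PostScan;
    }
    let throughput = bytes_saved as f64 / eval_secs;
    if throughput >= min_bytes_per_sec {
        Placement::RowLevel
    } else {
        Placement::PostScan
    }
}
```

Under this reading, a small example file never accumulates enough measured throughput to clear a 100 MB/s bar, so its filter stays post-scan unless the threshold is set to 0.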
adriangbot commented

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.20 / 4.65 ±6.80 / 18.26 ms │          1.20 / 4.68 ±6.86 / 18.39 ms │     no change │
│ QQuery 1  │        12.27 / 12.57 ±0.22 / 12.86 ms │        13.33 / 13.82 ±0.27 / 14.07 ms │  1.10x slower │
│ QQuery 2  │        36.29 / 36.69 ±0.36 / 37.22 ms │        36.64 / 37.48 ±0.80 / 38.85 ms │     no change │
│ QQuery 3  │        31.44 / 31.98 ±0.48 / 32.82 ms │        30.72 / 30.99 ±0.17 / 31.24 ms │     no change │
│ QQuery 4  │     240.78 / 243.61 ±1.72 / 245.93 ms │     239.70 / 243.89 ±2.91 / 248.66 ms │     no change │
│ QQuery 5  │     282.82 / 284.87 ±2.14 / 288.68 ms │     278.00 / 281.21 ±2.87 / 285.29 ms │     no change │
│ QQuery 6  │           6.55 / 7.15 ±0.62 / 8.23 ms │           5.17 / 5.60 ±0.34 / 6.04 ms │ +1.28x faster │
│ QQuery 7  │        13.61 / 13.69 ±0.06 / 13.78 ms │        14.44 / 15.30 ±1.28 / 17.83 ms │  1.12x slower │
│ QQuery 8  │     328.56 / 331.74 ±2.30 / 335.10 ms │     322.65 / 325.32 ±3.11 / 331.41 ms │     no change │
│ QQuery 9  │     449.06 / 454.08 ±3.55 / 458.84 ms │    446.07 / 457.41 ±10.38 / 475.93 ms │     no change │
│ QQuery 10 │        73.33 / 76.09 ±4.58 / 85.18 ms │        70.27 / 70.78 ±0.29 / 71.06 ms │ +1.07x faster │
│ QQuery 11 │        85.09 / 86.60 ±2.18 / 90.93 ms │        80.24 / 81.52 ±1.15 / 83.55 ms │ +1.06x faster │
│ QQuery 12 │     275.04 / 279.02 ±5.23 / 289.31 ms │     262.93 / 268.44 ±3.67 / 274.38 ms │     no change │
│ QQuery 13 │     385.52 / 401.00 ±9.47 / 414.62 ms │    408.66 / 419.26 ±10.70 / 438.44 ms │     no change │
│ QQuery 14 │     287.44 / 292.88 ±4.90 / 301.28 ms │     270.98 / 273.95 ±2.64 / 278.54 ms │ +1.07x faster │
│ QQuery 15 │     284.42 / 288.84 ±3.91 / 295.44 ms │     285.24 / 288.24 ±3.22 / 292.87 ms │     no change │
│ QQuery 16 │     621.23 / 627.64 ±4.73 / 633.45 ms │     600.47 / 607.86 ±6.55 / 616.08 ms │     no change │
│ QQuery 17 │     620.90 / 625.90 ±3.04 / 629.74 ms │     601.61 / 605.55 ±2.95 / 610.41 ms │     no change │
│ QQuery 18 │ 1246.89 / 1264.89 ±13.38 / 1286.35 ms │  1195.10 / 1207.90 ±8.87 / 1221.69 ms │     no change │
│ QQuery 19 │        28.08 / 29.06 ±0.98 / 30.75 ms │        27.67 / 33.30 ±7.48 / 46.71 ms │  1.15x slower │
│ QQuery 20 │     518.77 / 528.33 ±9.72 / 540.93 ms │     512.74 / 518.72 ±4.74 / 526.79 ms │     no change │
│ QQuery 21 │     601.54 / 604.79 ±3.81 / 612.10 ms │     595.71 / 600.18 ±3.64 / 604.71 ms │     no change │
│ QQuery 22 │ 1060.94 / 1078.07 ±10.71 / 1091.29 ms │  1178.47 / 1187.85 ±5.38 / 1194.08 ms │  1.10x slower │
│ QQuery 23 │ 3340.33 / 3360.25 ±20.71 / 3393.86 ms │     737.03 / 748.76 ±7.12 / 757.40 ms │ +4.49x faster │
│ QQuery 24 │        42.17 / 43.83 ±1.66 / 46.30 ms │        49.37 / 52.33 ±3.31 / 57.76 ms │  1.19x slower │
│ QQuery 25 │     113.94 / 115.89 ±2.08 / 119.79 ms │     115.50 / 117.84 ±2.80 / 123.04 ms │     no change │
│ QQuery 26 │        42.79 / 43.29 ±0.42 / 43.86 ms │        50.38 / 54.08 ±5.98 / 65.98 ms │  1.25x slower │
│ QQuery 27 │     666.99 / 676.46 ±6.61 / 686.68 ms │     646.04 / 650.54 ±3.63 / 655.57 ms │     no change │
│ QQuery 28 │ 3026.05 / 3035.07 ±11.63 / 3057.33 ms │ 2982.49 / 2998.33 ±10.56 / 3013.34 ms │     no change │
│ QQuery 29 │        42.17 / 45.44 ±4.47 / 53.88 ms │       41.88 / 52.71 ±20.77 / 94.24 ms │  1.16x slower │
│ QQuery 30 │     312.22 / 316.40 ±4.94 / 322.52 ms │     304.24 / 308.91 ±3.66 / 313.73 ms │     no change │
│ QQuery 31 │     305.96 / 313.63 ±7.52 / 326.53 ms │     364.20 / 368.76 ±2.86 / 371.61 ms │  1.18x slower │
│ QQuery 32 │ 1013.66 / 1037.86 ±39.79 / 1117.22 ms │    942.74 / 955.16 ±15.88 / 986.41 ms │ +1.09x faster │
│ QQuery 33 │ 1434.31 / 1451.40 ±24.00 / 1498.49 ms │ 1413.88 / 1425.93 ±12.07 / 1442.25 ms │     no change │
│ QQuery 34 │ 1433.52 / 1471.97 ±38.45 / 1545.56 ms │ 1423.88 / 1436.39 ±12.25 / 1459.77 ms │     no change │
│ QQuery 35 │     286.54 / 294.89 ±6.75 / 306.23 ms │    285.93 / 298.37 ±16.24 / 330.32 ms │     no change │
│ QQuery 36 │        63.90 / 70.97 ±6.88 / 80.46 ms │        62.03 / 66.63 ±3.93 / 72.29 ms │ +1.07x faster │
│ QQuery 37 │        35.35 / 36.51 ±1.54 / 39.48 ms │       35.26 / 42.41 ±11.58 / 65.39 ms │  1.16x slower │
│ QQuery 38 │        39.64 / 45.24 ±4.16 / 50.52 ms │        41.28 / 43.28 ±1.70 / 46.21 ms │     no change │
│ QQuery 39 │     127.64 / 132.76 ±3.53 / 137.94 ms │     116.32 / 127.29 ±6.57 / 136.65 ms │     no change │
│ QQuery 40 │        14.24 / 14.76 ±0.32 / 15.14 ms │        14.12 / 17.86 ±3.90 / 23.20 ms │  1.21x slower │
│ QQuery 41 │        13.62 / 13.81 ±0.11 / 13.94 ms │        13.80 / 15.49 ±2.95 / 21.36 ms │  1.12x slower │
│ QQuery 42 │        13.39 / 15.41 ±2.43 / 18.86 ms │        13.36 / 13.79 ±0.48 / 14.59 ms │ +1.12x faster │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20139.97ms │
│ Total Time (adaptive-filters-in-decoder)   │ 17374.08ms │
│ Average Time (HEAD)                        │   468.37ms │
│ Average Time (adaptive-filters-in-decoder) │   404.05ms │
│ Queries Faster                             │          8 │
│ Queries Slower                             │         11 │
│ Queries with No Change                     │         24 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 105.0s |
| Peak memory | 30.5 GiB |
| Avg memory | 23.1 GiB |
| CPU user | 1062.7s |
| CPU sys | 65.0s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
| --- | --- |
| Wall time | 90.0s |
| Peak memory | 30.6 GiB |
| Avg memory | 23.4 GiB |
| CPU user | 923.2s |
| CPU sys | 52.3s |
| Peak spill | 0 B |

File an issue against this benchmark runner

adriangbot commented

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.19 / 4.63 ±6.79 / 18.21 ms │          1.20 / 4.65 ±6.79 / 18.23 ms │      no change │
│ QQuery 1  │        12.27 / 12.54 ±0.15 / 12.69 ms │        13.71 / 13.92 ±0.14 / 14.11 ms │   1.11x slower │
│ QQuery 2  │        36.14 / 36.64 ±0.52 / 37.46 ms │        36.11 / 36.40 ±0.32 / 37.02 ms │      no change │
│ QQuery 3  │        31.65 / 31.94 ±0.45 / 32.84 ms │        30.93 / 31.25 ±0.19 / 31.50 ms │      no change │
│ QQuery 4  │     240.57 / 245.46 ±3.18 / 249.17 ms │     244.06 / 246.53 ±1.56 / 247.85 ms │      no change │
│ QQuery 5  │     281.95 / 284.32 ±1.31 / 285.75 ms │     278.28 / 280.13 ±1.43 / 282.65 ms │      no change │
│ QQuery 6  │           6.63 / 7.07 ±0.34 / 7.55 ms │           5.10 / 5.33 ±0.17 / 5.55 ms │  +1.33x faster │
│ QQuery 7  │        13.50 / 13.72 ±0.13 / 13.88 ms │        14.62 / 15.50 ±1.31 / 18.12 ms │   1.13x slower │
│ QQuery 8  │     325.76 / 327.80 ±1.37 / 329.42 ms │     316.74 / 321.90 ±5.20 / 331.74 ms │      no change │
│ QQuery 9  │     448.85 / 453.20 ±6.38 / 465.77 ms │     445.69 / 451.38 ±4.56 / 458.49 ms │      no change │
│ QQuery 10 │        73.44 / 76.29 ±3.72 / 83.61 ms │        70.14 / 70.82 ±0.75 / 72.21 ms │  +1.08x faster │
│ QQuery 11 │        85.40 / 86.00 ±0.89 / 87.76 ms │        81.45 / 85.34 ±7.18 / 99.69 ms │      no change │
│ QQuery 12 │     274.52 / 280.22 ±3.81 / 285.91 ms │     263.79 / 268.26 ±3.72 / 274.37 ms │      no change │
│ QQuery 13 │     393.20 / 399.01 ±6.63 / 411.89 ms │     408.89 / 421.96 ±8.52 / 432.31 ms │   1.06x slower │
│ QQuery 14 │     283.25 / 289.04 ±4.69 / 297.46 ms │     272.64 / 275.70 ±1.71 / 277.36 ms │      no change │
│ QQuery 15 │     277.92 / 284.53 ±6.42 / 295.81 ms │     290.06 / 293.41 ±5.33 / 304.01 ms │      no change │
│ QQuery 16 │     615.14 / 626.04 ±7.29 / 635.19 ms │     597.22 / 606.81 ±5.12 / 611.11 ms │      no change │
│ QQuery 17 │     615.52 / 623.50 ±5.39 / 631.11 ms │     602.50 / 608.76 ±3.91 / 613.12 ms │      no change │
│ QQuery 18 │ 1257.61 / 1279.03 ±19.49 / 1308.78 ms │  1207.50 / 1212.03 ±3.19 / 1217.02 ms │  +1.06x faster │
│ QQuery 19 │        28.14 / 28.62 ±0.33 / 29.16 ms │        27.93 / 30.65 ±4.64 / 39.90 ms │   1.07x slower │
│ QQuery 20 │     517.35 / 524.29 ±8.15 / 539.84 ms │     515.01 / 522.84 ±4.95 / 528.75 ms │      no change │
│ QQuery 21 │     592.26 / 597.66 ±4.15 / 603.63 ms │     624.22 / 636.18 ±6.68 / 642.39 ms │   1.06x slower │
│ QQuery 22 │ 1057.65 / 1074.23 ±15.16 / 1099.96 ms │    881.79 / 895.39 ±10.12 / 910.84 ms │  +1.20x faster │
│ QQuery 23 │ 3324.08 / 3369.44 ±29.73 / 3412.93 ms │     167.72 / 173.41 ±3.87 / 179.55 ms │ +19.43x faster │
│ QQuery 24 │        41.78 / 45.57 ±3.71 / 50.86 ms │        31.51 / 34.44 ±3.52 / 41.34 ms │  +1.32x faster │
│ QQuery 25 │     112.74 / 114.50 ±2.26 / 118.85 ms │     116.86 / 120.13 ±4.70 / 129.46 ms │      no change │
│ QQuery 26 │        42.00 / 42.66 ±0.52 / 43.16 ms │        50.54 / 51.41 ±1.03 / 53.40 ms │   1.21x slower │
│ QQuery 27 │     669.64 / 676.19 ±6.63 / 685.29 ms │     643.96 / 647.59 ±2.28 / 651.02 ms │      no change │
│ QQuery 28 │ 2994.04 / 3012.72 ±10.75 / 3024.39 ms │  2987.68 / 2994.88 ±5.27 / 2999.41 ms │      no change │
│ QQuery 29 │        42.00 / 46.11 ±7.13 / 60.33 ms │        41.97 / 52.56 ±8.95 / 65.01 ms │   1.14x slower │
│ QQuery 30 │     306.97 / 313.06 ±5.74 / 323.75 ms │     304.74 / 309.10 ±3.08 / 314.07 ms │      no change │
│ QQuery 31 │     298.61 / 311.09 ±9.25 / 325.55 ms │     370.25 / 371.82 ±2.20 / 376.13 ms │   1.20x slower │
│ QQuery 32 │ 1006.15 / 1019.95 ±12.32 / 1041.98 ms │    948.15 / 962.75 ±14.61 / 989.11 ms │  +1.06x faster │
│ QQuery 33 │ 1413.47 / 1439.52 ±20.78 / 1475.73 ms │ 1409.88 / 1425.49 ±12.52 / 1441.46 ms │      no change │
│ QQuery 34 │ 1439.77 / 1463.52 ±22.50 / 1502.18 ms │ 1430.02 / 1443.59 ±16.02 / 1472.74 ms │      no change │
│ QQuery 35 │    290.59 / 300.22 ±14.04 / 327.12 ms │    287.00 / 310.22 ±25.89 / 351.44 ms │      no change │
│ QQuery 36 │        61.55 / 68.12 ±6.47 / 79.67 ms │        60.35 / 71.41 ±8.65 / 83.87 ms │      no change │
│ QQuery 37 │        35.59 / 36.09 ±0.51 / 36.82 ms │        34.98 / 39.50 ±5.46 / 49.54 ms │   1.09x slower │
│ QQuery 38 │        40.46 / 43.35 ±3.78 / 50.55 ms │        36.30 / 37.76 ±1.24 / 39.24 ms │  +1.15x faster │
│ QQuery 39 │     125.37 / 138.90 ±7.45 / 146.62 ms │     125.34 / 131.92 ±9.27 / 150.28 ms │  +1.05x faster │
│ QQuery 40 │        14.63 / 14.88 ±0.14 / 15.04 ms │        17.92 / 18.28 ±0.30 / 18.83 ms │   1.23x slower │
│ QQuery 41 │        13.92 / 14.05 ±0.08 / 14.15 ms │        16.86 / 17.78 ±1.39 / 20.56 ms │   1.27x slower │
│ QQuery 42 │        13.26 / 13.53 ±0.14 / 13.64 ms │        15.90 / 17.79 ±2.90 / 23.55 ms │   1.31x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20069.25ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16567.00ms │
│ Average Time (HEAD)                        │   466.73ms │
│ Average Time (adaptive-filters-in-decoder) │   385.28ms │
│ Queries Faster                             │          9 │
│ Queries Slower                             │         12 │
│ Queries with No Change                     │         22 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 105.0s |
| Peak memory | 30.5 GiB |
| Avg memory | 23.1 GiB |
| CPU user | 1061.3s |
| CPU sys | 64.2s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
| --- | --- |
| Wall time | 85.0s |
| Peak memory | 29.7 GiB |
| Avg memory | 23.3 GiB |
| CPU user | 876.1s |
| CPU sys | 50.6s |
| Peak spill | 0 B |

adriangbot commented

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.20 / 4.59 ±6.70 / 17.99 ms │          1.21 / 4.62 ±6.72 / 18.06 ms │      no change │
│ QQuery 1  │        12.28 / 12.81 ±0.27 / 13.04 ms │        13.93 / 14.20 ±0.25 / 14.56 ms │   1.11x slower │
│ QQuery 2  │        36.49 / 37.01 ±0.40 / 37.51 ms │        36.86 / 37.56 ±0.93 / 39.39 ms │      no change │
│ QQuery 3  │        31.49 / 31.98 ±0.53 / 32.96 ms │        31.15 / 31.47 ±0.31 / 32.07 ms │      no change │
│ QQuery 4  │     239.02 / 245.06 ±3.58 / 249.53 ms │     246.94 / 248.59 ±1.57 / 250.56 ms │      no change │
│ QQuery 5  │     285.22 / 286.79 ±1.45 / 288.53 ms │     280.75 / 282.92 ±1.56 / 285.57 ms │      no change │
│ QQuery 6  │           6.61 / 7.84 ±0.68 / 8.48 ms │           5.24 / 5.80 ±0.43 / 6.52 ms │  +1.35x faster │
│ QQuery 7  │        13.71 / 14.51 ±1.42 / 17.33 ms │        15.01 / 16.21 ±2.08 / 20.37 ms │   1.12x slower │
│ QQuery 8  │     328.13 / 331.94 ±2.10 / 334.19 ms │     320.35 / 323.37 ±1.90 / 325.41 ms │      no change │
│ QQuery 9  │     450.14 / 455.83 ±4.30 / 461.04 ms │     449.07 / 458.77 ±8.99 / 473.20 ms │      no change │
│ QQuery 10 │        74.39 / 75.32 ±0.84 / 76.87 ms │        71.22 / 71.92 ±0.47 / 72.66 ms │      no change │
│ QQuery 11 │        85.28 / 87.04 ±1.33 / 89.38 ms │        82.61 / 83.58 ±0.90 / 84.97 ms │      no change │
│ QQuery 12 │     275.52 / 279.48 ±3.07 / 284.08 ms │     266.81 / 270.42 ±3.49 / 276.62 ms │      no change │
│ QQuery 13 │     397.41 / 404.80 ±6.68 / 416.77 ms │     418.68 / 428.00 ±6.51 / 438.37 ms │   1.06x slower │
│ QQuery 14 │     286.97 / 291.80 ±2.80 / 295.53 ms │    275.62 / 286.85 ±17.08 / 320.84 ms │      no change │
│ QQuery 15 │     284.05 / 290.63 ±9.14 / 308.62 ms │     286.77 / 289.89 ±3.52 / 296.55 ms │      no change │
│ QQuery 16 │     620.79 / 628.44 ±5.14 / 635.81 ms │     599.54 / 609.88 ±8.46 / 622.39 ms │      no change │
│ QQuery 17 │     622.64 / 632.18 ±7.02 / 641.41 ms │     603.13 / 613.15 ±6.13 / 622.28 ms │      no change │
│ QQuery 18 │ 1255.27 / 1280.97 ±17.72 / 1302.69 ms │  1205.14 / 1219.21 ±9.61 / 1233.83 ms │      no change │
│ QQuery 19 │        28.63 / 28.86 ±0.15 / 29.08 ms │       27.97 / 37.96 ±12.15 / 60.99 ms │   1.32x slower │
│ QQuery 20 │     522.44 / 531.66 ±7.23 / 542.94 ms │     515.76 / 522.91 ±6.36 / 534.59 ms │      no change │
│ QQuery 21 │     596.52 / 605.86 ±5.94 / 613.23 ms │     623.13 / 636.33 ±9.98 / 651.97 ms │   1.05x slower │
│ QQuery 22 │  1062.73 / 1069.95 ±8.34 / 1085.95 ms │     890.25 / 897.26 ±4.64 / 902.99 ms │  +1.19x faster │
│ QQuery 23 │ 3336.24 / 3375.85 ±27.34 / 3411.54 ms │     167.73 / 172.98 ±4.76 / 179.30 ms │ +19.52x faster │
│ QQuery 24 │        42.26 / 42.66 ±0.55 / 43.72 ms │        31.98 / 32.49 ±0.65 / 33.73 ms │  +1.31x faster │
│ QQuery 25 │     115.74 / 119.74 ±7.74 / 135.23 ms │     115.56 / 118.88 ±4.57 / 127.91 ms │      no change │
│ QQuery 26 │        42.60 / 43.33 ±0.55 / 44.07 ms │        49.85 / 52.84 ±3.22 / 58.50 ms │   1.22x slower │
│ QQuery 27 │     673.79 / 679.89 ±5.02 / 685.80 ms │     645.94 / 650.22 ±3.71 / 655.79 ms │      no change │
│ QQuery 28 │ 3019.82 / 3038.01 ±11.49 / 3051.22 ms │ 2998.69 / 3020.15 ±18.69 / 3046.96 ms │      no change │
│ QQuery 29 │        42.97 / 52.44 ±8.42 / 64.83 ms │        42.36 / 46.30 ±4.68 / 53.56 ms │  +1.13x faster │
│ QQuery 30 │     310.64 / 314.90 ±2.83 / 319.48 ms │     307.10 / 311.29 ±3.44 / 317.43 ms │      no change │
│ QQuery 31 │     302.37 / 317.07 ±9.35 / 330.17 ms │     370.50 / 373.16 ±3.13 / 379.17 ms │   1.18x slower │
│ QQuery 32 │  1007.83 / 1015.38 ±8.66 / 1032.27 ms │     961.47 / 969.22 ±7.76 / 982.88 ms │      no change │
│ QQuery 33 │ 1434.27 / 1461.27 ±14.19 / 1474.59 ms │ 1409.28 / 1440.84 ±29.27 / 1493.90 ms │      no change │
│ QQuery 34 │ 1443.08 / 1468.99 ±20.67 / 1501.90 ms │ 1433.03 / 1450.42 ±10.99 / 1465.73 ms │      no change │
│ QQuery 35 │    291.21 / 305.33 ±15.71 / 335.12 ms │    288.39 / 305.45 ±26.23 / 357.45 ms │      no change │
│ QQuery 36 │        62.14 / 66.30 ±5.03 / 73.72 ms │        64.18 / 71.17 ±7.42 / 85.55 ms │   1.07x slower │
│ QQuery 37 │        35.87 / 41.08 ±5.43 / 50.58 ms │        35.57 / 35.69 ±0.15 / 35.97 ms │  +1.15x faster │
│ QQuery 38 │        42.06 / 44.34 ±3.62 / 51.49 ms │        41.13 / 43.57 ±1.94 / 46.28 ms │      no change │
│ QQuery 39 │     129.02 / 134.29 ±3.99 / 139.42 ms │     122.64 / 131.79 ±6.89 / 141.02 ms │      no change │
│ QQuery 40 │        14.35 / 15.48 ±1.44 / 18.31 ms │        13.87 / 14.33 ±0.66 / 15.62 ms │  +1.08x faster │
│ QQuery 41 │        13.68 / 15.57 ±2.39 / 19.98 ms │        13.18 / 13.48 ±0.18 / 13.70 ms │  +1.15x faster │
│ QQuery 42 │        13.24 / 15.39 ±3.36 / 22.07 ms │        13.33 / 17.93 ±5.29 / 25.72 ms │   1.17x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20202.64ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16663.08ms │
│ Average Time (HEAD)                        │   469.83ms │
│ Average Time (adaptive-filters-in-decoder) │   387.51ms │
│ Queries Faster                             │          8 │
│ Queries Slower                             │          9 │
│ Queries with No Change                     │         26 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
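The summary figures above can be cross-checked by hand. The sketch below assumes that the average is the total divided by the 43 queries (0 through 42) and that the Change column is the ratio of the mean times; the runner's actual formula is not shown in this thread, and the helper names `average_ms` and `speedup` are hypothetical.

```rust
/// Mean per-query time for a total measured over `queries` queries.
fn average_ms(total_ms: f64, queries: u32) -> f64 {
    total_ms / f64::from(queries)
}

/// Ratio of baseline to branch time; values above 1.0 mean the branch is faster.
fn speedup(base_ms: f64, branch_ms: f64) -> f64 {
    base_ms / branch_ms
}

fn main() {
    // Totals copied from the summary table above; 43 = queries 0 through 42.
    let (head_total, branch_total, queries) = (20202.64, 16663.08, 43);
    println!("HEAD avg:   {:.2} ms", average_ms(head_total, queries));
    println!("branch avg: {:.2} ms", average_ms(branch_total, queries));
    println!("overall:    {:.2}x", speedup(head_total, branch_total));
    // Mean times for query 23, copied from the per-query table above.
    println!("Q23:        {:.2}x", speedup(3375.85, 172.98));
}
```

Under those assumptions the averages come out to 469.83 ms and 387.51 ms, matching the table, and the query 23 ratio matches the reported 19.52x. The overall ratio of about 1.21x is driven mostly by query 23.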

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 29.5 GiB |
| Avg memory | 23.0 GiB |
| CPU user | 1068.2s |
| CPU sys | 64.1s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 85.0s |
| Peak memory | 30.6 GiB |
| Avg memory | 23.3 GiB |
| CPU user | 881.2s |
| CPU sys | 50.6s |
| Peak spill | 0 B |

File an issue against this benchmark runner

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.17 / 4.56 ±6.69 / 17.95 ms │          1.19 / 4.58 ±6.73 / 18.03 ms │      no change │
│ QQuery 1  │        12.30 / 12.41 ±0.07 / 12.52 ms │        13.52 / 13.88 ±0.19 / 14.09 ms │   1.12x slower │
│ QQuery 2  │        36.22 / 36.65 ±0.37 / 37.16 ms │        36.70 / 36.97 ±0.23 / 37.25 ms │      no change │
│ QQuery 3  │        31.33 / 31.97 ±0.72 / 33.29 ms │        31.34 / 31.56 ±0.19 / 31.87 ms │      no change │
│ QQuery 4  │     237.70 / 241.53 ±3.23 / 246.25 ms │     238.48 / 241.55 ±2.43 / 245.19 ms │      no change │
│ QQuery 5  │     282.53 / 283.79 ±1.16 / 285.65 ms │     277.04 / 279.36 ±1.29 / 280.95 ms │      no change │
│ QQuery 6  │           6.62 / 7.20 ±0.45 / 7.76 ms │           5.31 / 5.90 ±0.45 / 6.41 ms │  +1.22x faster │
│ QQuery 7  │        13.31 / 13.79 ±0.59 / 14.94 ms │        14.58 / 14.91 ±0.26 / 15.37 ms │   1.08x slower │
│ QQuery 8  │     323.68 / 328.13 ±3.85 / 334.75 ms │     318.28 / 321.50 ±2.69 / 326.00 ms │      no change │
│ QQuery 9  │     445.88 / 455.35 ±8.39 / 469.01 ms │     446.53 / 450.62 ±4.07 / 457.87 ms │      no change │
│ QQuery 10 │        74.21 / 78.36 ±7.33 / 93.00 ms │        70.72 / 71.43 ±0.58 / 72.05 ms │  +1.10x faster │
│ QQuery 11 │        84.16 / 85.52 ±1.08 / 87.49 ms │        82.14 / 82.96 ±0.71 / 84.14 ms │      no change │
│ QQuery 12 │     274.11 / 280.80 ±5.66 / 287.20 ms │     263.83 / 267.28 ±3.02 / 272.32 ms │      no change │
│ QQuery 13 │     388.69 / 397.29 ±8.33 / 413.10 ms │     408.52 / 419.15 ±7.65 / 428.25 ms │   1.06x slower │
│ QQuery 14 │     284.89 / 289.55 ±5.81 / 300.92 ms │     271.95 / 278.29 ±5.48 / 287.86 ms │      no change │
│ QQuery 15 │     281.46 / 286.96 ±3.14 / 290.80 ms │     284.00 / 288.95 ±7.38 / 303.49 ms │      no change │
│ QQuery 16 │     622.27 / 626.39 ±3.86 / 631.29 ms │     604.67 / 611.32 ±8.06 / 627.02 ms │      no change │
│ QQuery 17 │     618.28 / 621.90 ±2.62 / 625.04 ms │     606.42 / 609.45 ±2.04 / 612.50 ms │      no change │
│ QQuery 18 │  1259.66 / 1264.75 ±7.59 / 1279.73 ms │  1205.17 / 1212.85 ±6.79 / 1222.46 ms │      no change │
│ QQuery 19 │        28.49 / 33.45 ±9.81 / 53.06 ms │        28.28 / 30.76 ±3.50 / 37.72 ms │  +1.09x faster │
│ QQuery 20 │     521.53 / 531.73 ±6.71 / 539.77 ms │     515.66 / 521.72 ±4.65 / 529.73 ms │      no change │
│ QQuery 21 │     595.15 / 605.68 ±7.03 / 616.74 ms │    632.64 / 653.10 ±17.31 / 674.44 ms │   1.08x slower │
│ QQuery 22 │  1067.46 / 1072.67 ±5.08 / 1080.27 ms │     894.17 / 902.01 ±5.91 / 911.03 ms │  +1.19x faster │
│ QQuery 23 │ 3331.77 / 3365.20 ±24.28 / 3396.44 ms │     162.99 / 169.60 ±3.82 / 173.06 ms │ +19.84x faster │
│ QQuery 24 │        42.77 / 44.69 ±2.88 / 50.40 ms │        31.59 / 36.25 ±4.51 / 42.02 ms │  +1.23x faster │
│ QQuery 25 │     114.05 / 115.52 ±1.01 / 117.18 ms │     115.04 / 117.16 ±1.25 / 118.86 ms │      no change │
│ QQuery 26 │        42.47 / 43.58 ±1.22 / 45.76 ms │        51.22 / 52.12 ±0.61 / 52.90 ms │   1.20x slower │
│ QQuery 27 │     671.02 / 677.58 ±5.10 / 686.30 ms │     647.61 / 654.47 ±4.88 / 662.50 ms │      no change │
│ QQuery 28 │ 3010.90 / 3029.19 ±11.26 / 3044.02 ms │  2996.76 / 3008.46 ±7.70 / 3020.16 ms │      no change │
│ QQuery 29 │        41.90 / 44.36 ±4.31 / 52.98 ms │        42.36 / 45.73 ±6.42 / 58.56 ms │      no change │
│ QQuery 30 │     306.07 / 311.00 ±5.16 / 320.85 ms │     307.37 / 310.67 ±2.49 / 314.05 ms │      no change │
│ QQuery 31 │     302.83 / 308.16 ±4.06 / 314.80 ms │     368.16 / 376.79 ±7.72 / 389.59 ms │   1.22x slower │
│ QQuery 32 │  1005.41 / 1008.41 ±2.13 / 1011.22 ms │    944.08 / 969.28 ±15.96 / 990.45 ms │      no change │
│ QQuery 33 │ 1426.03 / 1455.04 ±36.23 / 1520.05 ms │ 1425.42 / 1445.51 ±14.27 / 1469.63 ms │      no change │
│ QQuery 34 │  1432.79 / 1441.62 ±8.34 / 1455.94 ms │ 1425.70 / 1441.32 ±11.12 / 1459.93 ms │      no change │
│ QQuery 35 │    286.24 / 309.54 ±28.69 / 362.33 ms │    290.83 / 318.96 ±23.62 / 349.34 ms │      no change │
│ QQuery 36 │        63.23 / 70.50 ±7.57 / 84.17 ms │       60.10 / 71.26 ±11.01 / 90.73 ms │      no change │
│ QQuery 37 │        35.01 / 37.47 ±2.51 / 41.14 ms │        35.05 / 38.78 ±5.20 / 48.56 ms │      no change │
│ QQuery 38 │        40.35 / 44.84 ±2.29 / 46.74 ms │        35.64 / 38.73 ±1.75 / 40.53 ms │  +1.16x faster │
│ QQuery 39 │     119.42 / 132.07 ±8.25 / 141.42 ms │     121.78 / 125.67 ±2.30 / 128.99 ms │      no change │
│ QQuery 40 │        13.87 / 14.25 ±0.31 / 14.81 ms │        18.03 / 18.30 ±0.21 / 18.67 ms │   1.28x slower │
│ QQuery 41 │        13.47 / 17.54 ±7.56 / 32.66 ms │        17.73 / 22.47 ±6.42 / 35.19 ms │   1.28x slower │
│ QQuery 42 │        12.92 / 15.36 ±4.45 / 24.26 ms │        16.11 / 16.37 ±0.20 / 16.66 ms │   1.07x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20076.35ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16627.98ms │
│ Average Time (HEAD)                        │   466.89ms │
│ Average Time (adaptive-filters-in-decoder) │   386.70ms │
│ Queries Faster                             │          7 │
│ Queries Slower                             │          9 │
│ Queries with No Change                     │         27 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 29.6 GiB |
| Avg memory | 22.7 GiB |
| CPU user | 1059.7s |
| CPU sys | 64.9s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 85.0s |
| Peak memory | 30.2 GiB |
| Avg memory | 23.1 GiB |
| CPU user | 876.4s |
| CPU sys | 52.5s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.19 / 4.77 ±6.99 / 18.75 ms │          1.25 / 4.81 ±6.96 / 18.72 ms │      no change │
│ QQuery 1  │        12.76 / 13.06 ±0.23 / 13.38 ms │        14.10 / 14.35 ±0.22 / 14.65 ms │   1.10x slower │
│ QQuery 2  │        36.75 / 37.40 ±0.41 / 37.93 ms │        36.63 / 36.90 ±0.26 / 37.29 ms │      no change │
│ QQuery 3  │        32.36 / 33.61 ±1.09 / 35.02 ms │        30.49 / 30.96 ±0.44 / 31.77 ms │  +1.09x faster │
│ QQuery 4  │     260.42 / 268.05 ±5.61 / 274.39 ms │     243.36 / 252.41 ±6.68 / 261.91 ms │  +1.06x faster │
│ QQuery 5  │     284.74 / 293.63 ±7.04 / 304.91 ms │     280.89 / 288.59 ±7.47 / 298.11 ms │      no change │
│ QQuery 6  │           6.38 / 7.02 ±0.41 / 7.53 ms │           5.24 / 6.25 ±1.03 / 8.14 ms │  +1.12x faster │
│ QQuery 7  │        13.62 / 13.76 ±0.10 / 13.92 ms │        15.10 / 15.17 ±0.04 / 15.20 ms │   1.10x slower │
│ QQuery 8  │    326.62 / 338.68 ±11.06 / 353.43 ms │     318.32 / 323.82 ±4.17 / 328.82 ms │      no change │
│ QQuery 9  │     448.24 / 457.14 ±7.80 / 467.85 ms │     446.64 / 458.96 ±7.94 / 471.33 ms │      no change │
│ QQuery 10 │        73.96 / 74.26 ±0.27 / 74.66 ms │        70.20 / 72.22 ±2.16 / 75.95 ms │      no change │
│ QQuery 11 │        85.04 / 87.96 ±2.16 / 90.03 ms │        82.75 / 83.88 ±0.77 / 84.78 ms │      no change │
│ QQuery 12 │     281.00 / 291.37 ±8.35 / 299.04 ms │    266.30 / 283.05 ±13.65 / 304.49 ms │      no change │
│ QQuery 13 │    402.25 / 418.31 ±16.68 / 446.80 ms │    416.34 / 432.14 ±17.44 / 462.03 ms │      no change │
│ QQuery 14 │    285.03 / 298.97 ±15.13 / 322.30 ms │     280.80 / 290.50 ±5.91 / 298.67 ms │      no change │
│ QQuery 15 │     282.17 / 292.50 ±9.50 / 305.44 ms │     306.60 / 312.40 ±7.10 / 325.71 ms │   1.07x slower │
│ QQuery 16 │    622.88 / 643.47 ±10.99 / 653.44 ms │    627.80 / 647.14 ±14.05 / 665.06 ms │      no change │
│ QQuery 17 │    630.67 / 647.58 ±13.01 / 664.12 ms │    602.01 / 626.90 ±16.78 / 645.45 ms │      no change │
│ QQuery 18 │ 1267.60 / 1295.85 ±21.21 / 1328.07 ms │ 1241.85 / 1275.73 ±26.01 / 1322.25 ms │      no change │
│ QQuery 19 │        29.52 / 31.67 ±3.97 / 39.61 ms │        28.20 / 28.32 ±0.18 / 28.67 ms │  +1.12x faster │
│ QQuery 20 │     525.97 / 535.61 ±8.83 / 548.98 ms │     522.48 / 530.52 ±7.72 / 542.03 ms │      no change │
│ QQuery 21 │     592.84 / 602.77 ±9.13 / 617.04 ms │    622.02 / 631.65 ±11.29 / 652.66 ms │      no change │
│ QQuery 22 │  1067.94 / 1080.80 ±8.73 / 1091.32 ms │     894.85 / 898.91 ±3.31 / 901.79 ms │  +1.20x faster │
│ QQuery 23 │ 3360.42 / 3415.79 ±35.89 / 3460.21 ms │     174.98 / 176.47 ±1.11 / 177.95 ms │ +19.36x faster │
│ QQuery 24 │        41.74 / 44.25 ±2.79 / 49.57 ms │        32.39 / 37.08 ±7.67 / 52.27 ms │  +1.19x faster │
│ QQuery 25 │     117.13 / 120.12 ±2.71 / 123.81 ms │     118.27 / 122.33 ±5.55 / 132.83 ms │      no change │
│ QQuery 26 │        42.30 / 42.62 ±0.35 / 43.06 ms │        51.37 / 52.27 ±1.04 / 54.29 ms │   1.23x slower │
│ QQuery 27 │     680.83 / 693.13 ±9.52 / 706.60 ms │     641.99 / 648.55 ±4.87 / 654.45 ms │  +1.07x faster │
│ QQuery 28 │ 3045.38 / 3060.79 ±10.81 / 3078.82 ms │ 3043.33 / 3062.02 ±20.24 / 3100.30 ms │      no change │
│ QQuery 29 │        42.28 / 43.91 ±1.76 / 47.03 ms │        42.03 / 46.44 ±5.39 / 54.02 ms │   1.06x slower │
│ QQuery 30 │    311.69 / 325.23 ±12.89 / 349.05 ms │     313.54 / 323.39 ±7.05 / 331.12 ms │      no change │
│ QQuery 31 │    298.90 / 314.94 ±12.80 / 331.23 ms │     369.68 / 381.73 ±8.94 / 395.42 ms │   1.21x slower │
│ QQuery 32 │ 1039.14 / 1078.30 ±33.34 / 1132.90 ms │     973.32 / 982.42 ±9.35 / 998.43 ms │  +1.10x faster │
│ QQuery 33 │ 1490.16 / 1537.38 ±41.14 / 1599.57 ms │ 1468.20 / 1501.12 ±23.99 / 1542.53 ms │      no change │
│ QQuery 34 │ 1504.35 / 1534.35 ±25.18 / 1567.69 ms │ 1482.56 / 1513.00 ±28.93 / 1552.08 ms │      no change │
│ QQuery 35 │    291.22 / 322.86 ±28.72 / 370.27 ms │    312.93 / 350.94 ±28.60 / 400.98 ms │   1.09x slower │
│ QQuery 36 │        64.24 / 67.40 ±3.28 / 73.26 ms │        63.15 / 68.75 ±4.44 / 76.68 ms │      no change │
│ QQuery 37 │        35.65 / 38.55 ±3.56 / 45.48 ms │        35.48 / 42.39 ±5.99 / 51.89 ms │   1.10x slower │
│ QQuery 38 │        44.25 / 46.50 ±2.49 / 51.04 ms │        39.30 / 45.27 ±4.49 / 50.97 ms │      no change │
│ QQuery 39 │     132.11 / 137.60 ±4.51 / 141.76 ms │     119.16 / 126.74 ±3.89 / 129.63 ms │  +1.09x faster │
│ QQuery 40 │        15.86 / 17.36 ±2.42 / 22.15 ms │        18.27 / 20.02 ±1.36 / 21.99 ms │   1.15x slower │
│ QQuery 41 │        15.11 / 16.43 ±2.10 / 20.55 ms │        17.72 / 22.49 ±6.58 / 34.67 ms │   1.37x slower │
│ QQuery 42 │        13.40 / 14.19 ±0.52 / 14.65 ms │        17.70 / 17.89 ±0.20 / 18.28 ms │   1.26x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20639.94ms │
│ Total Time (adaptive-filters-in-decoder)   │ 17086.88ms │
│ Average Time (HEAD)                        │   480.00ms │
│ Average Time (adaptive-filters-in-decoder) │   397.37ms │
│ Queries Faster                             │         10 │
│ Queries Slower                             │         11 │
│ Queries with No Change                     │         22 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 30.8 GiB |
| Avg memory | 23.2 GiB |
| CPU user | 1091.4s |
| CPU sys | 65.2s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 90.0s |
| Peak memory | 30.6 GiB |
| Avg memory | 23.4 GiB |
| CPU user | 905.1s |
| CPU sys | 52.5s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.21 / 4.61 ±6.73 / 18.06 ms │          1.18 / 4.56 ±6.70 / 17.95 ms │      no change │
│ QQuery 1  │        12.49 / 12.68 ±0.13 / 12.87 ms │        13.98 / 14.10 ±0.09 / 14.25 ms │   1.11x slower │
│ QQuery 2  │        36.47 / 37.20 ±0.68 / 38.36 ms │        36.59 / 37.07 ±0.39 / 37.69 ms │      no change │
│ QQuery 3  │        31.40 / 32.24 ±1.06 / 34.28 ms │        30.72 / 31.17 ±0.52 / 32.15 ms │      no change │
│ QQuery 4  │     239.53 / 244.98 ±3.29 / 248.55 ms │     246.25 / 247.79 ±1.06 / 249.26 ms │      no change │
│ QQuery 5  │     284.91 / 285.66 ±0.68 / 286.71 ms │     279.94 / 281.87 ±1.92 / 285.32 ms │      no change │
│ QQuery 6  │           6.40 / 7.26 ±0.61 / 7.95 ms │           4.99 / 5.62 ±0.42 / 6.23 ms │  +1.29x faster │
│ QQuery 7  │        13.56 / 14.06 ±0.85 / 15.76 ms │        14.92 / 16.10 ±2.06 / 20.22 ms │   1.14x slower │
│ QQuery 8  │     326.62 / 330.26 ±1.98 / 332.27 ms │     320.83 / 325.88 ±5.28 / 335.59 ms │      no change │
│ QQuery 9  │     449.83 / 457.29 ±4.11 / 461.71 ms │     447.15 / 460.16 ±9.20 / 472.67 ms │      no change │
│ QQuery 10 │        73.67 / 74.28 ±0.62 / 75.16 ms │        70.78 / 75.62 ±6.14 / 87.23 ms │      no change │
│ QQuery 11 │        85.21 / 85.33 ±0.16 / 85.60 ms │        81.18 / 81.69 ±0.50 / 82.58 ms │      no change │
│ QQuery 12 │     275.61 / 279.67 ±3.02 / 283.56 ms │     265.39 / 274.39 ±6.18 / 282.22 ms │      no change │
│ QQuery 13 │    392.72 / 403.19 ±11.26 / 424.34 ms │     411.53 / 420.21 ±7.59 / 434.25 ms │      no change │
│ QQuery 14 │     287.35 / 290.17 ±1.64 / 291.69 ms │     273.87 / 280.49 ±4.28 / 287.03 ms │      no change │
│ QQuery 15 │     286.07 / 290.72 ±3.74 / 296.58 ms │     283.13 / 289.79 ±4.77 / 297.19 ms │      no change │
│ QQuery 16 │     623.85 / 628.23 ±3.69 / 634.93 ms │     599.89 / 609.85 ±7.34 / 620.52 ms │      no change │
│ QQuery 17 │     625.16 / 633.36 ±6.95 / 643.87 ms │     605.71 / 612.02 ±6.74 / 624.41 ms │      no change │
│ QQuery 18 │  1270.84 / 1279.56 ±9.61 / 1296.39 ms │  1198.72 / 1213.39 ±8.68 / 1224.35 ms │  +1.05x faster │
│ QQuery 19 │        28.31 / 28.48 ±0.15 / 28.71 ms │       28.02 / 40.10 ±17.07 / 72.61 ms │   1.41x slower │
│ QQuery 20 │     520.67 / 527.24 ±8.30 / 543.21 ms │     520.35 / 526.09 ±6.71 / 539.01 ms │      no change │
│ QQuery 21 │     600.13 / 606.54 ±4.47 / 611.53 ms │    599.71 / 631.20 ±19.09 / 657.11 ms │      no change │
│ QQuery 22 │  1063.69 / 1073.18 ±8.10 / 1086.96 ms │     883.79 / 893.03 ±5.86 / 900.37 ms │  +1.20x faster │
│ QQuery 23 │ 3332.96 / 3374.34 ±36.37 / 3424.53 ms │     164.15 / 166.96 ±2.63 / 171.35 ms │ +20.21x faster │
│ QQuery 24 │        41.68 / 42.22 ±0.64 / 43.37 ms │        31.51 / 31.81 ±0.43 / 32.63 ms │  +1.33x faster │
│ QQuery 25 │     112.97 / 115.79 ±3.18 / 121.54 ms │     115.77 / 119.01 ±3.63 / 125.78 ms │      no change │
│ QQuery 26 │        42.29 / 43.93 ±3.04 / 50.00 ms │        50.26 / 51.25 ±0.85 / 52.66 ms │   1.17x slower │
│ QQuery 27 │     669.46 / 680.84 ±6.76 / 688.42 ms │     643.65 / 649.77 ±5.31 / 659.32 ms │      no change │
│ QQuery 28 │ 2993.41 / 3012.70 ±19.37 / 3049.13 ms │ 2989.92 / 3011.22 ±21.44 / 3051.08 ms │      no change │
│ QQuery 29 │       42.48 / 53.77 ±15.82 / 83.84 ms │        41.87 / 43.13 ±1.31 / 45.60 ms │  +1.25x faster │
│ QQuery 30 │     308.16 / 316.17 ±5.91 / 324.30 ms │     310.02 / 314.47 ±2.70 / 317.91 ms │      no change │
│ QQuery 31 │     300.39 / 308.18 ±6.18 / 315.78 ms │     373.81 / 377.23 ±2.47 / 380.03 ms │   1.22x slower │
│ QQuery 32 │ 1008.34 / 1019.33 ±12.67 / 1044.13 ms │     971.53 / 978.42 ±7.61 / 992.61 ms │      no change │
│ QQuery 33 │ 1452.13 / 1478.92 ±43.09 / 1564.53 ms │ 1416.16 / 1441.89 ±14.37 / 1457.65 ms │      no change │
│ QQuery 34 │ 1435.31 / 1456.11 ±19.61 / 1482.06 ms │ 1432.00 / 1450.60 ±14.42 / 1469.74 ms │      no change │
│ QQuery 35 │     288.04 / 296.99 ±7.28 / 307.83 ms │     291.69 / 295.99 ±3.33 / 299.44 ms │      no change │
│ QQuery 36 │        68.70 / 72.88 ±3.78 / 79.81 ms │        60.84 / 70.88 ±6.74 / 81.93 ms │      no change │
│ QQuery 37 │        35.18 / 35.65 ±0.27 / 35.93 ms │        35.22 / 39.37 ±3.27 / 44.33 ms │   1.10x slower │
│ QQuery 38 │        40.37 / 43.95 ±4.23 / 51.44 ms │        36.28 / 41.16 ±4.78 / 50.18 ms │  +1.07x faster │
│ QQuery 39 │     122.26 / 135.46 ±8.30 / 143.86 ms │     123.20 / 130.85 ±7.11 / 141.26 ms │      no change │
│ QQuery 40 │        14.38 / 14.59 ±0.21 / 14.86 ms │        18.40 / 19.22 ±1.08 / 21.31 ms │   1.32x slower │
│ QQuery 41 │        13.68 / 15.74 ±3.81 / 23.36 ms │        17.20 / 20.03 ±4.99 / 29.99 ms │   1.27x slower │
│ QQuery 42 │        13.11 / 13.54 ±0.33 / 14.10 ms │        16.12 / 16.34 ±0.21 / 16.63 ms │   1.21x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20157.29ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16641.81ms │
│ Average Time (HEAD)                        │   468.77ms │
│ Average Time (adaptive-filters-in-decoder) │   387.02ms │
│ Queries Faster                             │          7 │
│ Queries Slower                             │          9 │
│ Queries with No Change                     │         27 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
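
As a quick sanity check on the summary above, here is a minimal sketch of how its figures relate to each other. The `Summary` struct is hypothetical and not part of the benchmark runner. It assumes the average is the total divided by the number of queries (7 faster + 9 slower + 27 unchanged = 43).

```rust
/// Hypothetical holder for one side of a benchmark comparison.
struct Summary {
    total_ms: f64,
    queries: u32,
}

impl Summary {
    /// Average time per query, assuming a plain arithmetic mean.
    fn average_ms(&self) -> f64 {
        self.total_ms / f64::from(self.queries)
    }
}

fn main() {
    // Totals copied from the summary table; the query count is the sum of
    // the faster, slower, and no-change rows.
    let queries = 7 + 9 + 27;
    let head = Summary { total_ms: 20157.29, queries };
    let branch = Summary { total_ms: 16641.81, queries };

    println!("HEAD average:   {:.2} ms", head.average_ms());
    println!("branch average: {:.2} ms", branch.average_ms());
    println!("total-time ratio: {:.2}x", head.total_ms / branch.total_ms);
}
```

Under that assumption the two averages reproduce the 468.77 ms and 387.02 ms rows, and the total-time ratio comes out at roughly 1.21x in favour of the branch.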

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 31.9 GiB |
| Avg memory | 23.3 GiB |
| CPU user | 1066.7s |
| CPU sys | 63.8s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 85.0s |
| Peak memory | 30.0 GiB |
| Avg memory | 23.1 GiB |
| CPU user | 882.0s |
| CPU sys | 50.4s |
| Peak spill | 0 B |

File an issue against this benchmark runner

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.23 / 4.83 ±7.02 / 18.87 ms │          1.23 / 4.69 ±6.81 / 18.31 ms │      no change │
│ QQuery 1  │        12.48 / 13.98 ±0.85 / 14.71 ms │        12.92 / 13.05 ±0.10 / 13.20 ms │  +1.07x faster │
│ QQuery 2  │        37.27 / 37.90 ±0.39 / 38.35 ms │        36.66 / 36.94 ±0.26 / 37.28 ms │      no change │
│ QQuery 3  │        32.76 / 34.31 ±2.14 / 38.47 ms │        32.05 / 32.44 ±0.37 / 33.11 ms │  +1.06x faster │
│ QQuery 4  │     250.10 / 256.40 ±5.38 / 263.49 ms │     246.74 / 251.11 ±3.86 / 257.52 ms │      no change │
│ QQuery 5  │     291.75 / 295.35 ±2.41 / 298.79 ms │     291.14 / 292.81 ±1.54 / 295.47 ms │      no change │
│ QQuery 6  │           6.41 / 7.14 ±0.53 / 7.71 ms │           5.76 / 6.29 ±0.41 / 6.96 ms │  +1.13x faster │
│ QQuery 7  │        13.81 / 14.08 ±0.21 / 14.45 ms │        16.33 / 16.47 ±0.10 / 16.60 ms │   1.17x slower │
│ QQuery 8  │     340.87 / 344.84 ±3.90 / 351.38 ms │     332.46 / 335.11 ±2.32 / 337.98 ms │      no change │
│ QQuery 9  │     472.23 / 480.25 ±5.81 / 489.75 ms │     455.12 / 462.92 ±5.53 / 469.55 ms │      no change │
│ QQuery 10 │        75.57 / 78.30 ±4.03 / 86.31 ms │     100.90 / 103.46 ±2.52 / 107.92 ms │   1.32x slower │
│ QQuery 11 │        87.00 / 87.85 ±0.64 / 88.61 ms │     109.74 / 110.77 ±0.73 / 111.79 ms │   1.26x slower │
│ QQuery 12 │     289.05 / 294.83 ±4.34 / 299.92 ms │     314.06 / 319.03 ±3.75 / 325.42 ms │   1.08x slower │
│ QQuery 13 │     409.90 / 414.68 ±4.09 / 420.61 ms │     449.47 / 461.09 ±9.06 / 477.08 ms │   1.11x slower │
│ QQuery 14 │     293.21 / 299.02 ±4.78 / 306.53 ms │     330.79 / 333.99 ±1.83 / 336.47 ms │   1.12x slower │
│ QQuery 15 │    298.93 / 312.87 ±11.08 / 328.42 ms │     294.05 / 301.17 ±4.07 / 305.66 ms │      no change │
│ QQuery 16 │    646.86 / 662.68 ±16.82 / 693.56 ms │     640.75 / 645.96 ±2.93 / 648.59 ms │      no change │
│ QQuery 17 │     638.55 / 643.68 ±5.11 / 651.88 ms │     642.85 / 650.09 ±5.57 / 655.95 ms │      no change │
│ QQuery 18 │ 1275.38 / 1303.14 ±23.27 / 1343.18 ms │ 1298.34 / 1325.68 ±19.80 / 1358.71 ms │      no change │
│ QQuery 19 │        28.77 / 29.42 ±0.42 / 30.02 ms │        30.36 / 33.13 ±4.57 / 42.24 ms │   1.13x slower │
│ QQuery 20 │     527.38 / 534.87 ±9.48 / 552.32 ms │     530.00 / 540.32 ±6.84 / 549.15 ms │      no change │
│ QQuery 21 │     602.72 / 607.82 ±5.86 / 618.79 ms │     580.68 / 587.55 ±6.86 / 600.00 ms │      no change │
│ QQuery 22 │  1075.37 / 1086.32 ±8.30 / 1098.80 ms │     941.83 / 945.74 ±3.82 / 952.61 ms │  +1.15x faster │
│ QQuery 23 │ 3384.78 / 3439.23 ±35.51 / 3478.16 ms │     113.73 / 126.94 ±9.54 / 138.77 ms │ +27.09x faster │
│ QQuery 24 │        42.78 / 50.83 ±9.18 / 67.19 ms │        42.75 / 45.24 ±3.04 / 51.04 ms │  +1.12x faster │
│ QQuery 25 │     115.50 / 119.40 ±4.70 / 128.02 ms │     147.89 / 152.10 ±4.49 / 160.61 ms │   1.27x slower │
│ QQuery 26 │        43.08 / 43.57 ±0.76 / 45.08 ms │        62.73 / 64.77 ±2.69 / 70.06 ms │   1.49x slower │
│ QQuery 27 │     679.99 / 685.94 ±5.36 / 692.93 ms │     724.52 / 736.15 ±9.52 / 746.58 ms │   1.07x slower │
│ QQuery 28 │ 3031.77 / 3058.99 ±17.81 / 3082.91 ms │  3068.55 / 3085.01 ±9.71 / 3095.49 ms │      no change │
│ QQuery 29 │       42.47 / 48.28 ±10.45 / 69.18 ms │       42.96 / 51.51 ±11.94 / 73.79 ms │   1.07x slower │
│ QQuery 30 │     314.32 / 321.19 ±5.23 / 329.33 ms │    330.98 / 342.50 ±16.56 / 375.28 ms │   1.07x slower │
│ QQuery 31 │     313.94 / 320.91 ±5.43 / 330.44 ms │     317.35 / 325.22 ±6.73 / 335.42 ms │      no change │
│ QQuery 32 │ 1051.72 / 1073.12 ±14.04 / 1091.71 ms │ 1042.46 / 1082.17 ±25.58 / 1109.76 ms │      no change │
│ QQuery 33 │ 1480.19 / 1510.50 ±18.70 / 1527.86 ms │ 1481.35 / 1499.61 ±14.68 / 1519.35 ms │      no change │
│ QQuery 34 │ 1485.39 / 1515.94 ±27.24 / 1566.33 ms │ 1506.10 / 1533.57 ±26.77 / 1580.33 ms │      no change │
│ QQuery 35 │    300.10 / 310.74 ±12.78 / 334.19 ms │    297.51 / 309.30 ±14.46 / 337.17 ms │      no change │
│ QQuery 36 │      63.38 / 74.92 ±17.41 / 109.50 ms │        63.74 / 70.62 ±8.09 / 85.60 ms │  +1.06x faster │
│ QQuery 37 │        36.66 / 38.91 ±4.15 / 47.22 ms │        38.23 / 44.42 ±4.09 / 51.10 ms │   1.14x slower │
│ QQuery 38 │        44.38 / 46.89 ±4.59 / 56.06 ms │        36.97 / 38.77 ±1.25 / 40.28 ms │  +1.21x faster │
│ QQuery 39 │    129.15 / 144.14 ±12.56 / 163.36 ms │     122.24 / 128.02 ±5.52 / 137.45 ms │  +1.13x faster │
│ QQuery 40 │        14.65 / 15.04 ±0.30 / 15.56 ms │        19.19 / 22.59 ±3.61 / 29.37 ms │   1.50x slower │
│ QQuery 41 │        14.21 / 17.25 ±5.67 / 28.58 ms │        17.50 / 19.48 ±2.14 / 23.48 ms │   1.13x slower │
│ QQuery 42 │        13.97 / 14.30 ±0.26 / 14.73 ms │        14.76 / 15.40 ±0.37 / 15.79 ms │   1.08x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20694.68ms │
│ Total Time (adaptive-filters-in-decoder)   │ 17503.21ms │
│ Average Time (HEAD)                        │   481.27ms │
│ Average Time (adaptive-filters-in-decoder) │   407.05ms │
│ Queries Faster                             │          9 │
│ Queries Slower                             │         16 │
│ Queries with No Change                     │         18 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
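
For readers checking the Change column, the labels appear to follow from the ratio of the mean times (the middle value in each cell), with ratios inside roughly 5% reported as "no change". The sketch below is inferred from the rows above rather than taken from the runner's source. The `classify` and `label` names and the exact threshold are assumptions.

```rust
/// Outcome of comparing one query's mean time on the baseline and the branch.
#[derive(Debug, PartialEq)]
enum Change {
    NoChange,
    Faster(f64),
    Slower(f64),
}

/// Assumed noise band: ratios within 5% are treated as "no change".
const NOISE_THRESHOLD: f64 = 0.05;

/// Classify a query from its mean times in milliseconds.
fn classify(base_mean_ms: f64, branch_mean_ms: f64) -> Change {
    // Always express the ratio as slower time / faster time, so it is >= 1.
    let (ratio, branch_is_faster) = if branch_mean_ms <= base_mean_ms {
        (base_mean_ms / branch_mean_ms, true)
    } else {
        (branch_mean_ms / base_mean_ms, false)
    };
    if ratio - 1.0 <= NOISE_THRESHOLD {
        Change::NoChange
    } else if branch_is_faster {
        Change::Faster(ratio)
    } else {
        Change::Slower(ratio)
    }
}

/// Render the classification in the same style as the table.
fn label(change: &Change) -> String {
    match change {
        Change::NoChange => "no change".to_string(),
        Change::Faster(r) => format!("+{r:.2}x faster"),
        Change::Slower(r) => format!("{r:.2}x slower"),
    }
}

fn main() {
    // (query, HEAD mean, branch mean) copied from the table above.
    for (q, base, branch) in [(9, 480.25, 462.92), (23, 3439.23, 126.94), (26, 43.57, 64.77)] {
        println!("Q{q}: {}", label(&classify(base, branch)));
    }
}
```

With the three rows used here, the sketch agrees with the table: Q9 stays inside the noise band, Q23 is the large win, and Q26 is one of the larger regressions.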

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 28.8 GiB |
| Avg memory | 22.7 GiB |
| CPU user | 1087.6s |
| CPU sys | 70.0s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 90.0s |
| Peak memory | 31.2 GiB |
| Avg memory | 23.5 GiB |
| CPU user | 921.4s |
| CPU sys | 58.9s |
| Peak spill | 0 B |

@adriangb
Owner Author

run benchmarks

baseline:
    ref: main
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: true
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: true
changed:
    ref: HEAD
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: true
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: true

@adriangb
Owner Author

run benchmarks

baseline:
    ref: main
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: false
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: false
changed:
    ref: HEAD
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: false
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: false

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4340678050-1897-l2m6f 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing HEAD (04e7aab) to main diff using: clickbench_partitioned
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4340679035-1902-qzxvj 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing HEAD (04e7aab) to main diff using: tpch
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4340679035-1900-6v9cj 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing HEAD (04e7aab) to main diff using: clickbench_partitioned
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4340678050-1899-g9kbn 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing HEAD (04e7aab) to main diff using: tpch
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4340678050-1898-6nkwh 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing HEAD (04e7aab) to main diff using: tpcds
Results will be posted here when complete


@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4340679035-1901-g4xsg 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

Comparing HEAD (04e7aab) to main diff using: tpcds
Results will be posted here when complete


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃    adaptive-filters-in-decoder ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 40.30 / 41.13 ±1.14 / 43.31 ms │ 38.05 / 39.33 ±1.02 / 40.81 ms │     no change │
│ QQuery 2  │ 20.84 / 21.18 ±0.35 / 21.84 ms │ 20.81 / 21.06 ±0.24 / 21.39 ms │     no change │
│ QQuery 3  │ 39.13 / 40.75 ±0.91 / 41.85 ms │ 36.39 / 37.45 ±1.12 / 39.36 ms │ +1.09x faster │
│ QQuery 4  │ 18.17 / 18.58 ±0.57 / 19.70 ms │ 17.87 / 17.90 ±0.04 / 17.96 ms │     no change │
│ QQuery 5  │ 47.37 / 49.81 ±1.86 / 51.85 ms │ 47.13 / 48.42 ±1.14 / 50.29 ms │     no change │
│ QQuery 6  │ 16.98 / 17.34 ±0.29 / 17.84 ms │ 16.44 / 16.88 ±0.44 / 17.48 ms │     no change │
│ QQuery 7  │ 54.37 / 55.00 ±0.35 / 55.43 ms │ 53.01 / 55.55 ±1.40 / 57.17 ms │     no change │
│ QQuery 8  │ 47.39 / 47.81 ±0.26 / 48.21 ms │ 46.73 / 47.26 ±0.70 / 48.62 ms │     no change │
│ QQuery 9  │ 53.53 / 53.91 ±0.34 / 54.47 ms │ 52.29 / 53.05 ±0.70 / 53.95 ms │     no change │
│ QQuery 10 │ 65.29 / 65.87 ±0.56 / 66.92 ms │ 64.16 / 64.89 ±1.12 / 67.11 ms │     no change │
│ QQuery 11 │ 13.90 / 14.08 ±0.12 / 14.27 ms │ 13.86 / 13.98 ±0.11 / 14.16 ms │     no change │
│ QQuery 12 │ 27.28 / 27.69 ±0.53 / 28.68 ms │ 26.01 / 26.19 ±0.16 / 26.43 ms │ +1.06x faster │
│ QQuery 13 │ 37.40 / 38.06 ±0.44 / 38.58 ms │ 37.67 / 39.44 ±2.69 / 44.79 ms │     no change │
│ QQuery 14 │ 27.64 / 27.84 ±0.19 / 28.13 ms │ 27.06 / 27.28 ±0.20 / 27.56 ms │     no change │
│ QQuery 15 │ 32.94 / 33.62 ±1.19 / 36.00 ms │ 32.25 / 32.44 ±0.21 / 32.82 ms │     no change │
│ QQuery 16 │ 15.35 / 15.39 ±0.02 / 15.42 ms │ 15.12 / 15.23 ±0.10 / 15.41 ms │     no change │
│ QQuery 17 │ 77.96 / 79.61 ±1.15 / 81.40 ms │ 75.46 / 75.60 ±0.16 / 75.83 ms │ +1.05x faster │
│ QQuery 18 │ 74.38 / 74.97 ±0.63 / 76.00 ms │ 73.08 / 75.34 ±1.37 / 76.57 ms │     no change │
│ QQuery 19 │ 37.54 / 37.67 ±0.11 / 37.83 ms │ 35.10 / 35.50 ±0.67 / 36.84 ms │ +1.06x faster │
│ QQuery 20 │ 39.10 / 39.90 ±1.10 / 42.08 ms │ 38.55 / 38.88 ±0.37 / 39.58 ms │     no change │
│ QQuery 21 │ 63.22 / 65.84 ±2.69 / 70.74 ms │ 61.43 / 63.14 ±1.88 / 66.55 ms │     no change │
│ QQuery 22 │ 23.75 / 23.89 ±0.07 / 23.97 ms │ 23.50 / 23.96 ±0.57 / 25.06 ms │     no change │
└───────────┴────────────────────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                          ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 889.92ms │
│ Total Time (adaptive-filters-in-decoder)   │ 868.81ms │
│ Average Time (HEAD)                        │  40.45ms │
│ Average Time (adaptive-filters-in-decoder) │  39.49ms │
│ Queries Faster                             │        4 │
│ Queries Slower                             │        0 │
│ Queries with No Change                     │       18 │
│ Queries with Failure                       │        0 │
└────────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 5.0s |
| Peak memory | 5.7 GiB |
| Avg memory | 5.0 GiB |
| CPU user | 33.4s |
| CPU sys | 2.4s |
| Peak spill | 0 B |

tpch — branch

| Metric | Value |
|---|---|
| Wall time | 5.0s |
| Peak memory | 5.3 GiB |
| Avg memory | 4.9 GiB |
| CPU user | 32.5s |
| CPU sys | 2.2s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                              HEAD ┃       adaptive-filters-in-decoder ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │    42.68 / 43.47 ±0.90 / 45.12 ms │    37.83 / 38.83 ±1.10 / 40.75 ms │ +1.12x faster │
│ QQuery 2  │    24.95 / 25.23 ±0.28 / 25.76 ms │    23.82 / 24.19 ±0.32 / 24.72 ms │     no change │
│ QQuery 3  │    48.07 / 49.74 ±1.67 / 52.68 ms │    55.68 / 56.26 ±0.70 / 57.54 ms │  1.13x slower │
│ QQuery 4  │    22.42 / 22.67 ±0.18 / 22.93 ms │    19.43 / 19.54 ±0.10 / 19.69 ms │ +1.16x faster │
│ QQuery 5  │    69.49 / 70.34 ±0.47 / 70.94 ms │    77.11 / 81.14 ±2.86 / 84.24 ms │  1.15x slower │
│ QQuery 6  │    35.72 / 36.20 ±0.42 / 36.78 ms │    15.87 / 16.07 ±0.10 / 16.15 ms │ +2.25x faster │
│ QQuery 7  │    57.66 / 57.89 ±0.18 / 58.22 ms │    76.52 / 78.22 ±2.05 / 82.18 ms │  1.35x slower │
│ QQuery 8  │    79.25 / 80.11 ±0.90 / 81.68 ms │    87.99 / 88.52 ±0.27 / 88.76 ms │  1.10x slower │
│ QQuery 9  │ 108.50 / 109.40 ±0.64 / 110.48 ms │ 145.56 / 147.92 ±2.24 / 151.65 ms │  1.35x slower │
│ QQuery 10 │    75.65 / 76.85 ±0.87 / 78.00 ms │    69.27 / 70.04 ±1.01 / 72.02 ms │ +1.10x faster │
│ QQuery 11 │    15.91 / 16.19 ±0.14 / 16.30 ms │    14.25 / 14.55 ±0.33 / 15.18 ms │ +1.11x faster │
│ QQuery 12 │    46.62 / 46.93 ±0.29 / 47.35 ms │    34.37 / 36.80 ±2.27 / 41.11 ms │ +1.28x faster │
│ QQuery 13 │    48.06 / 49.89 ±2.87 / 55.59 ms │    49.61 / 52.78 ±5.47 / 63.71 ms │  1.06x slower │
│ QQuery 14 │    40.56 / 41.53 ±0.97 / 43.21 ms │    40.14 / 40.86 ±0.46 / 41.55 ms │     no change │
│ QQuery 15 │    44.51 / 46.60 ±2.06 / 50.40 ms │    32.79 / 33.19 ±0.40 / 33.92 ms │ +1.40x faster │
│ QQuery 16 │    23.43 / 23.53 ±0.11 / 23.74 ms │    18.92 / 19.65 ±0.81 / 21.15 ms │ +1.20x faster │
│ QQuery 17 │ 156.57 / 157.59 ±1.08 / 159.23 ms │ 164.08 / 170.15 ±3.23 / 173.01 ms │  1.08x slower │
│ QQuery 18 │    75.25 / 75.99 ±0.78 / 77.51 ms │    79.96 / 81.82 ±2.04 / 85.00 ms │  1.08x slower │
│ QQuery 19 │    41.99 / 42.80 ±0.74 / 44.09 ms │    40.50 / 40.76 ±0.21 / 41.04 ms │     no change │
│ QQuery 20 │    45.72 / 45.96 ±0.21 / 46.31 ms │    73.11 / 73.73 ±0.40 / 74.33 ms │  1.60x slower │
│ QQuery 21 │    84.62 / 86.86 ±2.05 / 89.46 ms │    80.84 / 82.03 ±0.87 / 83.17 ms │ +1.06x faster │
│ QQuery 22 │    35.93 / 36.37 ±0.30 / 36.78 ms │    30.66 / 30.76 ±0.06 / 30.82 ms │ +1.18x faster │
└───────────┴───────────────────────────────────┴───────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 1242.13ms │
│ Total Time (adaptive-filters-in-decoder)   │ 1297.83ms │
│ Average Time (HEAD)                        │   56.46ms │
│ Average Time (adaptive-filters-in-decoder) │   58.99ms │
│ Queries Faster                             │        10 │
│ Queries Slower                             │         9 │
│ Queries with No Change                     │         3 │
│ Queries with Failure                       │         0 │
└────────────────────────────────────────────┴───────────┘

Resource Usage

tpch — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 10.0s |
| Peak memory | 5.6 GiB |
| Avg memory | 4.9 GiB |
| CPU user | 47.0s |
| CPU sys | 2.5s |
| Peak spill | 0 B |

tpch — branch

| Metric | Value |
|---|---|
| Wall time | 10.0s |
| Peak memory | 5.6 GiB |
| Avg memory | 4.8 GiB |
| CPU user | 50.9s |
| CPU sys | 2.2s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.21 / 4.61 ±6.75 / 18.10 ms │          1.20 / 4.60 ±6.69 / 17.98 ms │     no change │
│ QQuery 1  │        12.78 / 13.02 ±0.13 / 13.14 ms │        13.85 / 14.12 ±0.20 / 14.43 ms │  1.08x slower │
│ QQuery 2  │        36.49 / 36.93 ±0.35 / 37.52 ms │        36.77 / 36.89 ±0.14 / 37.15 ms │     no change │
│ QQuery 3  │        31.34 / 32.23 ±0.63 / 33.13 ms │        31.10 / 31.80 ±0.51 / 32.59 ms │     no change │
│ QQuery 4  │     249.93 / 254.70 ±2.97 / 258.98 ms │     238.72 / 246.15 ±4.18 / 250.05 ms │     no change │
│ QQuery 5  │     283.91 / 288.53 ±3.80 / 293.50 ms │     280.86 / 283.50 ±2.55 / 287.44 ms │     no change │
│ QQuery 6  │           5.66 / 6.12 ±0.40 / 6.80 ms │           5.28 / 6.08 ±0.94 / 7.84 ms │     no change │
│ QQuery 7  │        16.14 / 16.26 ±0.10 / 16.42 ms │        15.11 / 16.45 ±1.71 / 19.60 ms │     no change │
│ QQuery 8  │     326.97 / 332.34 ±4.90 / 340.42 ms │     320.17 / 325.28 ±3.61 / 329.45 ms │     no change │
│ QQuery 9  │     449.49 / 461.09 ±7.04 / 471.62 ms │     452.62 / 460.26 ±4.35 / 465.31 ms │     no change │
│ QQuery 10 │       98.53 / 99.17 ±0.60 / 100.25 ms │        71.74 / 72.32 ±0.68 / 73.58 ms │ +1.37x faster │
│ QQuery 11 │     108.00 / 109.18 ±0.82 / 110.40 ms │        83.25 / 85.39 ±3.21 / 91.70 ms │ +1.28x faster │
│ QQuery 12 │     312.91 / 315.81 ±3.34 / 322.31 ms │     267.76 / 271.00 ±2.77 / 275.62 ms │ +1.17x faster │
│ QQuery 13 │     435.52 / 440.75 ±2.91 / 444.37 ms │    412.30 / 425.64 ±17.03 / 455.48 ms │     no change │
│ QQuery 14 │     323.31 / 327.73 ±4.52 / 334.18 ms │     275.70 / 278.80 ±3.29 / 285.04 ms │ +1.18x faster │
│ QQuery 15 │     283.99 / 290.18 ±5.26 / 296.74 ms │     283.13 / 290.37 ±5.45 / 299.52 ms │     no change │
│ QQuery 16 │     618.28 / 629.68 ±6.51 / 636.66 ms │     606.63 / 612.89 ±4.42 / 619.56 ms │     no change │
│ QQuery 17 │     622.47 / 626.21 ±2.90 / 630.19 ms │    602.97 / 616.26 ±14.90 / 644.00 ms │     no change │
│ QQuery 18 │ 1258.66 / 1284.87 ±32.58 / 1346.96 ms │ 1203.69 / 1215.56 ±10.13 / 1228.61 ms │ +1.06x faster │
│ QQuery 19 │        30.48 / 32.90 ±4.32 / 41.54 ms │       28.14 / 38.77 ±18.42 / 75.53 ms │  1.18x slower │
│ QQuery 20 │     522.30 / 530.31 ±6.55 / 541.84 ms │     515.25 / 521.61 ±5.21 / 527.03 ms │     no change │
│ QQuery 21 │     571.40 / 579.33 ±5.82 / 588.55 ms │    625.79 / 639.61 ±10.52 / 656.06 ms │  1.10x slower │
│ QQuery 22 │    924.58 / 935.65 ±13.01 / 959.51 ms │     893.73 / 898.21 ±4.88 / 905.91 ms │     no change │
│ QQuery 23 │     119.87 / 123.70 ±4.86 / 133.24 ms │    170.43 / 182.04 ±11.78 / 203.12 ms │  1.47x slower │
│ QQuery 24 │        41.17 / 43.14 ±2.77 / 48.60 ms │        31.68 / 34.85 ±5.76 / 46.35 ms │ +1.24x faster │
│ QQuery 25 │     147.45 / 152.40 ±3.63 / 158.37 ms │     116.49 / 120.53 ±3.54 / 125.18 ms │ +1.26x faster │
│ QQuery 26 │        63.30 / 63.63 ±0.32 / 64.17 ms │        51.38 / 53.77 ±3.21 / 60.08 ms │ +1.18x faster │
│ QQuery 27 │     721.73 / 727.41 ±6.72 / 739.84 ms │     650.77 / 652.58 ±2.86 / 658.27 ms │ +1.11x faster │
│ QQuery 28 │ 3047.26 / 3060.98 ±12.67 / 3083.61 ms │ 2997.74 / 3009.14 ±12.22 / 3030.92 ms │     no change │
│ QQuery 29 │        43.38 / 49.12 ±6.91 / 59.59 ms │        42.22 / 48.41 ±7.47 / 58.76 ms │     no change │
│ QQuery 30 │     323.83 / 329.31 ±4.25 / 336.30 ms │     309.86 / 311.94 ±2.43 / 316.63 ms │ +1.06x faster │
│ QQuery 31 │     316.54 / 324.91 ±4.94 / 330.08 ms │     366.86 / 372.95 ±3.72 / 376.72 ms │  1.15x slower │
│ QQuery 32 │  1016.33 / 1024.84 ±7.44 / 1038.65 ms │   957.64 / 977.98 ±15.44 / 1002.51 ms │     no change │
│ QQuery 33 │  1475.53 / 1482.59 ±5.78 / 1492.67 ms │ 1425.41 / 1439.39 ±12.03 / 1454.39 ms │     no change │
│ QQuery 34 │ 1464.45 / 1486.79 ±15.15 / 1505.84 ms │ 1425.78 / 1442.65 ±14.45 / 1460.61 ms │     no change │
│ QQuery 35 │     308.38 / 313.30 ±7.63 / 328.42 ms │    292.77 / 314.09 ±22.39 / 342.84 ms │     no change │
│ QQuery 36 │        66.83 / 71.08 ±3.70 / 76.38 ms │        62.23 / 66.68 ±5.97 / 77.78 ms │ +1.07x faster │
│ QQuery 37 │        37.22 / 41.16 ±7.24 / 55.63 ms │        34.97 / 35.53 ±0.57 / 36.55 ms │ +1.16x faster │
│ QQuery 38 │        34.86 / 40.42 ±4.12 / 46.18 ms │        36.07 / 43.67 ±5.96 / 51.78 ms │  1.08x slower │
│ QQuery 39 │     121.99 / 128.06 ±5.35 / 136.77 ms │     122.78 / 128.93 ±7.23 / 142.23 ms │     no change │
│ QQuery 40 │        18.70 / 19.00 ±0.23 / 19.26 ms │        18.21 / 18.31 ±0.06 / 18.38 ms │     no change │
│ QQuery 41 │        17.14 / 17.96 ±1.26 / 20.46 ms │        17.07 / 17.26 ±0.13 / 17.43 ms │     no change │
│ QQuery 42 │        14.61 / 16.17 ±2.06 / 20.25 ms │        16.17 / 16.34 ±0.15 / 16.60 ms │     no change │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 17163.59ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16678.59ms │
│ Average Time (HEAD)                        │   399.15ms │
│ Average Time (adaptive-filters-in-decoder) │   387.87ms │
│ Queries Faster                             │         12 │
│ Queries Slower                             │          6 │
│ Queries with No Change                     │         25 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 90.0s |
| Peak memory | 30.2 GiB |
| Avg memory | 23.6 GiB |
| CPU user | 912.1s |
| CPU sys | 52.7s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 85.0s |
| Peak memory | 29.9 GiB |
| Avg memory | 23.2 GiB |
| CPU user | 881.6s |
| CPU sys | 50.5s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.18 / 4.60 ±6.76 / 18.12 ms │          1.16 / 4.53 ±6.69 / 17.91 ms │     no change │
│ QQuery 1  │        12.21 / 12.42 ±0.12 / 12.56 ms │        12.72 / 13.06 ±0.19 / 13.28 ms │  1.05x slower │
│ QQuery 2  │        36.72 / 36.79 ±0.08 / 36.91 ms │        35.99 / 36.26 ±0.24 / 36.67 ms │     no change │
│ QQuery 3  │        31.30 / 31.97 ±0.60 / 32.85 ms │        30.61 / 31.04 ±0.33 / 31.63 ms │     no change │
│ QQuery 4  │     238.64 / 243.76 ±4.02 / 250.11 ms │     240.30 / 244.43 ±3.04 / 248.47 ms │     no change │
│ QQuery 5  │     282.34 / 283.95 ±0.99 / 285.34 ms │     278.25 / 280.19 ±1.52 / 282.64 ms │     no change │
│ QQuery 6  │           6.45 / 7.04 ±0.33 / 7.42 ms │           6.53 / 7.11 ±0.39 / 7.61 ms │     no change │
│ QQuery 7  │        13.59 / 14.26 ±1.13 / 16.51 ms │        13.96 / 14.37 ±0.63 / 15.60 ms │     no change │
│ QQuery 8  │     324.64 / 329.27 ±3.09 / 334.26 ms │     316.53 / 322.08 ±6.51 / 333.99 ms │     no change │
│ QQuery 9  │     450.53 / 457.77 ±5.52 / 465.45 ms │     443.77 / 450.29 ±3.52 / 453.44 ms │     no change │
│ QQuery 10 │        73.79 / 77.18 ±4.87 / 86.83 ms │        69.31 / 70.01 ±0.64 / 70.99 ms │ +1.10x faster │
│ QQuery 11 │        84.24 / 84.82 ±0.53 / 85.69 ms │        80.26 / 81.18 ±0.58 / 81.92 ms │     no change │
│ QQuery 12 │     274.30 / 280.79 ±6.03 / 289.82 ms │     272.47 / 274.49 ±2.99 / 280.31 ms │     no change │
│ QQuery 13 │     393.21 / 400.45 ±8.05 / 415.20 ms │     389.27 / 394.92 ±5.05 / 402.75 ms │     no change │
│ QQuery 14 │    287.32 / 295.29 ±14.19 / 323.65 ms │     278.68 / 282.93 ±7.08 / 297.04 ms │     no change │
│ QQuery 15 │     284.17 / 286.80 ±1.87 / 289.23 ms │    282.95 / 291.78 ±11.20 / 312.93 ms │     no change │
│ QQuery 16 │     614.28 / 623.78 ±6.87 / 630.77 ms │     602.36 / 607.40 ±4.96 / 616.30 ms │     no change │
│ QQuery 17 │     621.01 / 623.71 ±2.02 / 626.68 ms │     601.17 / 609.91 ±7.64 / 619.76 ms │     no change │
│ QQuery 18 │ 1248.30 / 1266.40 ±14.31 / 1290.67 ms │ 1190.04 / 1221.04 ±17.21 / 1242.26 ms │     no change │
│ QQuery 19 │        28.57 / 30.45 ±3.09 / 36.60 ms │        28.21 / 31.68 ±4.23 / 37.85 ms │     no change │
│ QQuery 20 │     516.65 / 526.81 ±8.70 / 541.99 ms │     512.89 / 517.65 ±4.15 / 524.47 ms │     no change │
│ QQuery 21 │     590.91 / 596.63 ±4.44 / 604.27 ms │     589.99 / 593.47 ±2.60 / 596.61 ms │     no change │
│ QQuery 22 │  1061.75 / 1075.59 ±8.35 / 1087.34 ms │  1043.67 / 1050.41 ±4.43 / 1057.41 ms │     no change │
│ QQuery 23 │  3327.96 / 3342.39 ±8.86 / 3355.77 ms │ 3124.93 / 3165.27 ±32.59 / 3209.89 ms │ +1.06x faster │
│ QQuery 24 │        42.32 / 43.81 ±1.79 / 47.15 ms │        41.57 / 42.55 ±0.93 / 44.25 ms │     no change │
│ QQuery 25 │     115.01 / 116.37 ±1.43 / 118.76 ms │     110.16 / 112.32 ±2.73 / 117.55 ms │     no change │
│ QQuery 26 │        42.91 / 43.89 ±1.40 / 46.66 ms │        42.29 / 43.06 ±0.77 / 44.32 ms │     no change │
│ QQuery 27 │     667.10 / 671.87 ±3.88 / 675.60 ms │     670.37 / 672.83 ±2.33 / 676.28 ms │     no change │
│ QQuery 28 │  3017.25 / 3028.65 ±9.64 / 3043.81 ms │  2990.46 / 2998.05 ±5.28 / 3006.41 ms │     no change │
│ QQuery 29 │        42.32 / 48.13 ±6.95 / 59.26 ms │        41.71 / 47.26 ±5.29 / 55.36 ms │     no change │
│ QQuery 30 │     309.43 / 313.44 ±5.58 / 324.49 ms │     296.39 / 301.91 ±3.67 / 307.79 ms │     no change │
│ QQuery 31 │     302.95 / 310.74 ±6.50 / 319.10 ms │     296.33 / 298.71 ±1.48 / 300.27 ms │     no change │
│ QQuery 32 │  1001.13 / 1010.74 ±5.11 / 1014.82 ms │     957.26 / 969.82 ±7.02 / 977.91 ms │     no change │
│ QQuery 33 │ 1420.40 / 1440.01 ±11.22 / 1453.44 ms │ 1407.00 / 1421.36 ±10.40 / 1431.74 ms │     no change │
│ QQuery 34 │ 1424.55 / 1468.79 ±30.95 / 1516.50 ms │  1435.31 / 1443.38 ±4.92 / 1447.47 ms │     no change │
│ QQuery 35 │     284.15 / 291.20 ±4.49 / 296.51 ms │    287.81 / 316.58 ±40.56 / 395.54 ms │  1.09x slower │
│ QQuery 36 │        62.17 / 65.54 ±4.51 / 73.95 ms │        63.46 / 68.81 ±3.31 / 73.67 ms │     no change │
│ QQuery 37 │        35.34 / 37.10 ±2.40 / 41.76 ms │        35.67 / 39.72 ±4.83 / 47.16 ms │  1.07x slower │
│ QQuery 38 │        39.89 / 44.98 ±3.12 / 49.20 ms │        39.83 / 42.68 ±3.07 / 48.28 ms │ +1.05x faster │
│ QQuery 39 │     123.29 / 134.17 ±7.05 / 144.23 ms │     128.73 / 133.39 ±2.64 / 136.55 ms │     no change │
│ QQuery 40 │        14.56 / 16.60 ±3.42 / 23.41 ms │        13.79 / 14.10 ±0.29 / 14.62 ms │ +1.18x faster │
│ QQuery 41 │        13.96 / 15.49 ±1.42 / 17.47 ms │        13.55 / 16.27 ±3.33 / 21.06 ms │  1.05x slower │
│ QQuery 42 │        13.44 / 14.49 ±1.73 / 17.95 ms │        12.91 / 13.09 ±0.19 / 13.45 ms │ +1.11x faster │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20048.92ms │
│ Total Time (adaptive-filters-in-decoder)   │ 19591.39ms │
│ Average Time (HEAD)                        │   466.25ms │
│ Average Time (adaptive-filters-in-decoder) │   455.61ms │
│ Queries Faster                             │          5 │
│ Queries Slower                             │          4 │
│ Queries with No Change                     │         34 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 30.6 GiB |
| Avg memory | 23.2 GiB |
| CPU user | 1060.7s |
| CPU sys | 64.2s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 100.0s |
| Peak memory | 29.2 GiB |
| Avg memory | 23.2 GiB |
| CPU user | 1033.9s |
| CPU sys | 63.7s |
| Peak spill | 0 B |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃              adaptive-filters-in-decoder ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │              7.06 / 7.54 ±0.76 / 9.04 ms │              7.15 / 7.57 ±0.78 / 9.13 ms │     no change │
│ QQuery 2  │        145.00 / 145.69 ±0.51 / 146.53 ms │        144.29 / 144.41 ±0.07 / 144.50 ms │     no change │
│ QQuery 3  │        113.29 / 114.05 ±0.86 / 115.67 ms │        111.12 / 111.71 ±0.59 / 112.51 ms │     no change │
│ QQuery 4  │     1225.77 / 1241.71 ±9.15 / 1250.79 ms │     1235.65 / 1247.90 ±8.46 / 1259.72 ms │     no change │
│ QQuery 5  │        173.33 / 174.30 ±0.97 / 175.94 ms │        171.29 / 172.18 ±0.92 / 173.71 ms │     no change │
│ QQuery 6  │        129.29 / 131.27 ±1.06 / 132.24 ms │        125.65 / 127.12 ±1.65 / 130.24 ms │     no change │
│ QQuery 7  │        334.53 / 335.75 ±1.08 / 337.68 ms │        325.50 / 327.62 ±2.55 / 332.47 ms │     no change │
│ QQuery 8  │        113.64 / 114.52 ±1.23 / 116.97 ms │        111.04 / 111.82 ±0.61 / 112.84 ms │     no change │
│ QQuery 9  │          90.93 / 97.43 ±3.92 / 101.61 ms │          93.76 / 97.11 ±2.61 / 100.43 ms │     no change │
│ QQuery 10 │        103.44 / 104.06 ±0.45 / 104.71 ms │           94.52 / 96.24 ±1.62 / 99.29 ms │ +1.08x faster │
│ QQuery 11 │        842.14 / 852.83 ±7.03 / 864.23 ms │        837.18 / 846.21 ±5.75 / 853.74 ms │     no change │
│ QQuery 12 │           42.15 / 42.95 ±0.50 / 43.66 ms │           42.25 / 42.93 ±0.56 / 43.89 ms │     no change │
│ QQuery 13 │        385.12 / 388.41 ±2.76 / 393.43 ms │        379.97 / 383.97 ±2.67 / 387.44 ms │     no change │
│ QQuery 14 │     1024.68 / 1029.55 ±2.91 / 1033.29 ms │     1012.15 / 1016.71 ±2.76 / 1019.10 ms │     no change │
│ QQuery 15 │           14.47 / 14.68 ±0.33 / 15.34 ms │           14.28 / 14.66 ±0.38 / 15.21 ms │     no change │
│ QQuery 16 │              7.26 / 7.35 ±0.13 / 7.60 ms │              7.24 / 7.31 ±0.09 / 7.48 ms │     no change │
│ QQuery 17 │        221.00 / 222.68 ±1.34 / 224.34 ms │        214.64 / 216.29 ±1.20 / 217.83 ms │     no change │
│ QQuery 18 │        125.01 / 126.30 ±1.04 / 127.54 ms │        121.79 / 122.48 ±0.70 / 123.79 ms │     no change │
│ QQuery 19 │        153.66 / 154.74 ±1.01 / 156.62 ms │        149.78 / 150.70 ±0.88 / 152.24 ms │     no change │
│ QQuery 20 │           12.72 / 13.23 ±0.34 / 13.65 ms │           13.14 / 13.35 ±0.13 / 13.50 ms │     no change │
│ QQuery 21 │           18.81 / 19.44 ±0.46 / 20.01 ms │           18.47 / 18.72 ±0.25 / 19.18 ms │     no change │
│ QQuery 22 │        472.84 / 477.52 ±4.13 / 483.23 ms │        466.05 / 470.54 ±4.43 / 478.12 ms │     no change │
│ QQuery 23 │     1077.60 / 1086.78 ±8.27 / 1100.60 ms │     1060.83 / 1066.08 ±3.55 / 1070.14 ms │     no change │
│ QQuery 24 │        704.63 / 707.27 ±1.93 / 710.47 ms │        692.62 / 695.01 ±2.85 / 700.44 ms │     no change │
│ QQuery 25 │        329.95 / 334.16 ±3.89 / 339.18 ms │        326.59 / 328.49 ±1.26 / 329.93 ms │     no change │
│ QQuery 26 │           77.59 / 78.47 ±1.26 / 80.95 ms │           73.25 / 74.93 ±1.92 / 78.64 ms │     no change │
│ QQuery 27 │              7.18 / 7.44 ±0.16 / 7.64 ms │              7.22 / 7.31 ±0.13 / 7.56 ms │     no change │
│ QQuery 28 │        147.49 / 147.93 ±0.56 / 148.99 ms │        144.18 / 146.47 ±3.02 / 152.24 ms │     no change │
│ QQuery 29 │        270.48 / 274.40 ±2.88 / 278.49 ms │        264.30 / 265.84 ±1.98 / 269.69 ms │     no change │
│ QQuery 30 │           41.24 / 41.78 ±0.50 / 42.40 ms │           40.14 / 40.72 ±0.49 / 41.58 ms │     no change │
│ QQuery 31 │        165.60 / 168.39 ±2.74 / 173.63 ms │        161.74 / 164.62 ±3.77 / 172.04 ms │     no change │
│ QQuery 32 │           13.11 / 13.40 ±0.23 / 13.73 ms │           13.33 / 13.58 ±0.24 / 13.98 ms │     no change │
│ QQuery 33 │        139.25 / 139.68 ±0.34 / 140.26 ms │        135.90 / 137.16 ±1.64 / 140.23 ms │     no change │
│ QQuery 34 │              6.87 / 7.02 ±0.19 / 7.39 ms │              7.16 / 7.30 ±0.18 / 7.65 ms │     no change │
│ QQuery 35 │        100.68 / 102.92 ±3.00 / 108.77 ms │           94.81 / 96.26 ±1.14 / 97.47 ms │ +1.07x faster │
│ QQuery 36 │              6.63 / 6.82 ±0.12 / 7.00 ms │              6.93 / 7.13 ±0.10 / 7.21 ms │     no change │
│ QQuery 37 │              8.14 / 8.33 ±0.15 / 8.56 ms │              8.21 / 8.40 ±0.14 / 8.55 ms │     no change │
│ QQuery 38 │           84.58 / 87.66 ±3.18 / 93.28 ms │           82.77 / 85.15 ±2.43 / 88.86 ms │     no change │
│ QQuery 39 │        116.62 / 117.52 ±0.72 / 118.68 ms │        116.27 / 116.75 ±0.32 / 117.12 ms │     no change │
│ QQuery 40 │        101.75 / 103.78 ±1.49 / 105.59 ms │        100.73 / 103.10 ±2.31 / 107.23 ms │     no change │
│ QQuery 41 │           14.28 / 14.50 ±0.24 / 14.94 ms │           14.15 / 14.40 ±0.33 / 15.05 ms │     no change │
│ QQuery 42 │        106.05 / 107.52 ±1.02 / 109.03 ms │        103.86 / 104.53 ±0.72 / 105.56 ms │     no change │
│ QQuery 43 │              5.67 / 5.77 ±0.13 / 6.02 ms │              5.77 / 5.91 ±0.16 / 6.22 ms │     no change │
│ QQuery 44 │           10.57 / 10.75 ±0.10 / 10.83 ms │           10.81 / 11.02 ±0.15 / 11.27 ms │     no change │
│ QQuery 45 │           47.88 / 49.18 ±1.41 / 51.77 ms │           47.72 / 48.45 ±1.11 / 50.60 ms │     no change │
│ QQuery 46 │              8.28 / 8.45 ±0.22 / 8.89 ms │              8.52 / 8.64 ±0.16 / 8.95 ms │     no change │
│ QQuery 47 │        662.75 / 674.47 ±7.01 / 681.76 ms │        656.73 / 669.50 ±6.95 / 677.81 ms │     no change │
│ QQuery 48 │        272.02 / 274.52 ±2.36 / 278.43 ms │        266.79 / 270.79 ±4.79 / 279.53 ms │     no change │
│ QQuery 49 │        246.64 / 249.01 ±2.10 / 251.70 ms │        243.78 / 245.97 ±2.15 / 249.66 ms │     no change │
│ QQuery 50 │        200.80 / 203.32 ±2.01 / 206.95 ms │        194.65 / 200.72 ±4.66 / 209.01 ms │     no change │
│ QQuery 51 │        174.53 / 177.24 ±1.73 / 179.30 ms │        173.70 / 174.73 ±1.28 / 177.21 ms │     no change │
│ QQuery 52 │        106.68 / 108.70 ±2.39 / 113.35 ms │        105.69 / 106.56 ±0.76 / 107.63 ms │     no change │
│ QQuery 53 │        101.69 / 102.57 ±0.49 / 103.05 ms │         98.07 / 100.62 ±3.06 / 106.62 ms │     no change │
│ QQuery 54 │        145.03 / 146.14 ±0.97 / 147.96 ms │        142.93 / 143.51 ±0.52 / 144.18 ms │     no change │
│ QQuery 55 │        107.22 / 107.51 ±0.37 / 108.10 ms │        103.43 / 104.71 ±1.16 / 106.82 ms │     no change │
│ QQuery 56 │        138.94 / 140.21 ±0.93 / 141.54 ms │        136.40 / 137.55 ±0.97 / 138.82 ms │     no change │
│ QQuery 57 │        162.83 / 164.61 ±2.20 / 168.91 ms │        164.00 / 166.82 ±2.94 / 172.08 ms │     no change │
│ QQuery 58 │        262.56 / 264.37 ±1.10 / 266.03 ms │        262.37 / 265.08 ±1.85 / 267.91 ms │     no change │
│ QQuery 59 │        193.25 / 195.04 ±1.65 / 197.80 ms │        188.42 / 192.45 ±3.50 / 198.84 ms │     no change │
│ QQuery 60 │        140.02 / 141.45 ±0.72 / 142.01 ms │        136.78 / 138.00 ±1.20 / 140.06 ms │     no change │
│ QQuery 61 │           13.39 / 13.57 ±0.17 / 13.90 ms │           13.50 / 13.67 ±0.17 / 13.98 ms │     no change │
│ QQuery 62 │        852.44 / 855.49 ±2.26 / 859.05 ms │        855.35 / 861.42 ±4.36 / 867.54 ms │     no change │
│ QQuery 63 │        102.89 / 104.25 ±1.92 / 108.05 ms │         98.73 / 100.10 ±1.43 / 102.72 ms │     no change │
│ QQuery 64 │        649.27 / 653.95 ±5.57 / 664.83 ms │        640.68 / 648.28 ±5.88 / 656.47 ms │     no change │
│ QQuery 65 │        238.35 / 243.89 ±3.53 / 248.94 ms │        234.80 / 239.35 ±4.95 / 248.86 ms │     no change │
│ QQuery 66 │        211.59 / 217.95 ±5.66 / 226.76 ms │        212.23 / 218.29 ±6.87 / 231.16 ms │     no change │
│ QQuery 67 │        291.97 / 299.45 ±8.22 / 310.26 ms │       287.81 / 302.30 ±18.22 / 332.40 ms │     no change │
│ QQuery 68 │              8.51 / 8.80 ±0.33 / 9.46 ms │              8.48 / 8.68 ±0.17 / 8.99 ms │     no change │
│ QQuery 69 │          98.18 / 99.43 ±0.80 / 100.72 ms │           90.11 / 90.67 ±0.31 / 91.01 ms │ +1.10x faster │
│ QQuery 70 │        311.73 / 319.10 ±4.23 / 322.95 ms │       305.62 / 323.15 ±14.81 / 339.61 ms │     no change │
│ QQuery 71 │        132.54 / 133.81 ±0.78 / 134.97 ms │        129.27 / 132.52 ±4.57 / 141.60 ms │     no change │
│ QQuery 72 │        576.77 / 584.79 ±5.04 / 592.27 ms │       567.58 / 586.68 ±11.29 / 597.46 ms │     no change │
│ QQuery 73 │              6.57 / 6.77 ±0.26 / 7.29 ms │              6.61 / 6.79 ±0.24 / 7.27 ms │     no change │
│ QQuery 74 │       525.82 / 542.59 ±11.00 / 557.71 ms │        528.21 / 533.55 ±5.77 / 542.49 ms │     no change │
│ QQuery 75 │        265.05 / 268.32 ±3.40 / 274.78 ms │        264.58 / 267.25 ±2.45 / 270.36 ms │     no change │
│ QQuery 76 │        129.15 / 131.38 ±2.52 / 136.07 ms │        125.58 / 126.81 ±0.83 / 128.12 ms │     no change │
│ QQuery 77 │        187.22 / 189.49 ±3.39 / 196.19 ms │        183.73 / 186.32 ±3.33 / 192.82 ms │     no change │
│ QQuery 78 │        322.62 / 329.12 ±3.51 / 332.17 ms │        322.36 / 323.86 ±1.46 / 326.42 ms │     no change │
│ QQuery 79 │        226.53 / 229.13 ±2.49 / 233.60 ms │        223.24 / 226.40 ±2.38 / 229.63 ms │     no change │
│ QQuery 80 │        317.27 / 320.51 ±2.84 / 325.61 ms │        314.25 / 316.53 ±3.19 / 322.79 ms │     no change │
│ QQuery 81 │           25.84 / 26.74 ±1.30 / 29.28 ms │           25.19 / 25.44 ±0.17 / 25.71 ms │     no change │
│ QQuery 82 │           39.09 / 39.99 ±0.48 / 40.48 ms │           37.52 / 37.91 ±0.43 / 38.74 ms │ +1.05x faster │
│ QQuery 83 │           37.05 / 37.22 ±0.12 / 37.39 ms │           36.50 / 36.72 ±0.25 / 37.20 ms │     no change │
│ QQuery 84 │           45.88 / 46.16 ±0.24 / 46.46 ms │           45.37 / 45.52 ±0.09 / 45.64 ms │     no change │
│ QQuery 85 │        141.61 / 142.66 ±0.77 / 143.64 ms │        134.45 / 137.22 ±2.67 / 141.66 ms │     no change │
│ QQuery 86 │           37.11 / 37.78 ±0.40 / 38.35 ms │           36.82 / 37.51 ±0.58 / 38.53 ms │     no change │
│ QQuery 87 │              3.54 / 3.62 ±0.10 / 3.79 ms │              3.58 / 3.65 ±0.09 / 3.83 ms │     no change │
│ QQuery 88 │         99.21 / 101.11 ±2.57 / 106.16 ms │          97.52 / 99.65 ±3.24 / 106.07 ms │     no change │
│ QQuery 89 │        115.42 / 116.54 ±0.95 / 117.92 ms │        113.22 / 115.06 ±1.07 / 116.25 ms │     no change │
│ QQuery 90 │           21.96 / 22.42 ±0.51 / 23.36 ms │           21.54 / 22.16 ±0.37 / 22.57 ms │     no change │
│ QQuery 91 │           58.03 / 60.04 ±1.80 / 63.01 ms │           55.71 / 56.93 ±1.33 / 59.43 ms │ +1.05x faster │
│ QQuery 92 │           55.53 / 56.09 ±0.32 / 56.49 ms │           54.39 / 55.05 ±0.54 / 55.74 ms │     no change │
│ QQuery 93 │        178.16 / 180.22 ±1.46 / 182.58 ms │        172.75 / 175.52 ±1.77 / 178.34 ms │     no change │
│ QQuery 94 │           59.52 / 60.14 ±0.41 / 60.65 ms │           58.73 / 59.05 ±0.24 / 59.42 ms │     no change │
│ QQuery 95 │        125.18 / 126.30 ±1.16 / 128.49 ms │        123.22 / 123.69 ±0.56 / 124.75 ms │     no change │
│ QQuery 96 │           67.76 / 69.19 ±0.98 / 70.66 ms │           66.88 / 67.53 ±0.43 / 68.01 ms │     no change │
│ QQuery 97 │        115.62 / 117.16 ±1.16 / 119.22 ms │        113.02 / 116.60 ±5.08 / 126.65 ms │     no change │
│ QQuery 98 │        148.03 / 149.48 ±1.42 / 151.68 ms │        147.14 / 148.68 ±1.56 / 150.94 ms │     no change │
│ QQuery 99 │ 10623.68 / 10703.08 ±93.59 / 10884.52 ms │ 10688.70 / 10752.04 ±33.67 / 10780.50 ms │     no change │
└───────────┴──────────────────────────────────────────┴──────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 30042.83ms │
│ Total Time (adaptive-filters-in-decoder)   │ 29850.34ms │
│ Average Time (HEAD)                        │   303.46ms │
│ Average Time (adaptive-filters-in-decoder) │   301.52ms │
│ Queries Faster                             │          5 │
│ Queries Slower                             │          0 │
│ Queries with No Change                     │         94 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 155.0s |
| Peak memory | 6.4 GiB |
| Avg memory | 5.6 GiB |
| CPU user | 237.9s |
| CPU sys | 7.5s |
| Peak spill | 0 B |

tpcds — branch

| Metric | Value |
|---|---|
| Wall time | 150.0s |
| Peak memory | 6.3 GiB |
| Avg memory | 5.7 GiB |
| CPU user | 236.8s |
| CPU sys | 8.0s |
| Peak spill | 0 B |

File an issue against this benchmark runner

@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                      HEAD ┃              adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 1  │               6.67 / 7.19 ±0.86 / 8.91 ms │              6.63 / 7.31 ±0.91 / 9.12 ms │      no change │
│ QQuery 2  │        113.35 / 123.98 ±20.21 / 164.38 ms │        109.86 / 112.39 ±4.58 / 121.55 ms │  +1.10x faster │
│ QQuery 3  │         108.09 / 109.82 ±1.89 / 113.44 ms │        117.85 / 119.89 ±1.25 / 121.58 ms │   1.09x slower │
│ QQuery 4  │      1039.16 / 1044.45 ±4.34 / 1050.98 ms │    1129.85 / 1144.02 ±11.03 / 1156.49 ms │   1.10x slower │
│ QQuery 5  │         184.36 / 187.13 ±3.20 / 192.39 ms │        187.21 / 188.50 ±1.02 / 189.73 ms │      no change │
│ QQuery 6  │            88.01 / 88.93 ±0.63 / 89.69 ms │        123.08 / 128.16 ±2.99 / 132.03 ms │   1.44x slower │
│ QQuery 7  │         312.21 / 317.44 ±3.27 / 320.10 ms │        482.53 / 487.40 ±3.10 / 492.32 ms │   1.54x slower │
│ QQuery 8  │         148.36 / 150.91 ±4.06 / 158.98 ms │        142.13 / 145.74 ±4.27 / 154.10 ms │      no change │
│ QQuery 9  │        213.39 / 226.84 ±11.09 / 243.46 ms │         94.54 / 103.07 ±9.51 / 121.60 ms │  +2.20x faster │
│ QQuery 10 │         157.46 / 161.11 ±2.66 / 165.44 ms │        120.51 / 121.40 ±0.52 / 122.03 ms │  +1.33x faster │
│ QQuery 11 │        666.69 / 684.27 ±14.66 / 702.32 ms │       701.29 / 739.51 ±20.78 / 758.33 ms │   1.08x slower │
│ QQuery 12 │            35.19 / 35.59 ±0.42 / 36.29 ms │           37.71 / 37.89 ±0.15 / 38.09 ms │   1.06x slower │
│ QQuery 13 │         530.42 / 534.43 ±2.70 / 537.88 ms │        433.14 / 444.54 ±9.40 / 455.06 ms │  +1.20x faster │
│ QQuery 14 │      1000.95 / 1006.64 ±4.94 / 1015.03 ms │        948.53 / 958.42 ±5.58 / 965.12 ms │      no change │
│ QQuery 15 │            16.96 / 17.20 ±0.20 / 17.47 ms │           17.43 / 17.74 ±0.35 / 18.42 ms │      no change │
│ QQuery 16 │               7.00 / 7.13 ±0.16 / 7.46 ms │              7.14 / 7.25 ±0.15 / 7.56 ms │      no change │
│ QQuery 17 │         176.64 / 184.71 ±7.93 / 199.74 ms │        317.49 / 326.58 ±6.63 / 334.53 ms │   1.77x slower │
│ QQuery 18 │        363.12 / 396.68 ±20.00 / 418.39 ms │        315.17 / 320.60 ±3.37 / 324.82 ms │  +1.24x faster │
│ QQuery 19 │         130.22 / 133.16 ±2.67 / 137.94 ms │        166.88 / 174.06 ±5.22 / 181.30 ms │   1.31x slower │
│ QQuery 20 │            15.46 / 15.84 ±0.36 / 16.33 ms │           12.97 / 13.30 ±0.29 / 13.75 ms │  +1.19x faster │
│ QQuery 21 │            26.77 / 27.19 ±0.37 / 27.77 ms │           19.60 / 19.94 ±0.26 / 20.31 ms │  +1.36x faster │
│ QQuery 22 │        491.37 / 508.53 ±10.78 / 519.75 ms │       472.63 / 489.26 ±10.58 / 503.20 ms │      no change │
│ QQuery 23 │     1414.44 / 1437.68 ±21.40 / 1472.39 ms │    1341.17 / 1378.70 ±34.44 / 1431.42 ms │      no change │
│ QQuery 24 │         178.21 / 184.00 ±6.08 / 195.26 ms │    1326.45 / 1353.92 ±28.87 / 1396.63 ms │   7.36x slower │
│ QQuery 25 │         267.68 / 276.02 ±6.45 / 285.96 ms │        569.10 / 573.85 ±4.71 / 582.28 ms │   2.08x slower │
│ QQuery 26 │         131.36 / 135.12 ±2.56 / 138.26 ms │        148.64 / 159.67 ±9.30 / 170.57 ms │   1.18x slower │
│ QQuery 27 │               7.00 / 7.22 ±0.14 / 7.41 ms │              6.78 / 6.90 ±0.17 / 7.24 ms │      no change │
│ QQuery 28 │         203.93 / 207.02 ±2.33 / 209.15 ms │        143.46 / 146.33 ±3.00 / 151.77 ms │  +1.41x faster │
│ QQuery 29 │         221.88 / 228.65 ±9.99 / 248.43 ms │        446.33 / 456.21 ±5.22 / 460.65 ms │   2.00x slower │
│ QQuery 30 │            50.62 / 52.21 ±1.28 / 54.21 ms │           44.95 / 45.54 ±0.31 / 45.83 ms │  +1.15x faster │
│ QQuery 31 │         165.86 / 169.31 ±2.77 / 174.33 ms │        178.01 / 182.15 ±2.41 / 185.29 ms │   1.08x slower │
│ QQuery 32 │            14.27 / 14.57 ±0.23 / 14.96 ms │           13.06 / 13.26 ±0.28 / 13.80 ms │  +1.10x faster │
│ QQuery 33 │         122.18 / 123.09 ±0.75 / 124.06 ms │        141.09 / 144.82 ±3.05 / 148.84 ms │   1.18x slower │
│ QQuery 34 │               6.97 / 7.10 ±0.16 / 7.40 ms │              7.07 / 7.21 ±0.21 / 7.63 ms │      no change │
│ QQuery 35 │         134.27 / 139.56 ±8.52 / 156.51 ms │        122.69 / 127.90 ±5.49 / 134.66 ms │  +1.09x faster │
│ QQuery 36 │               6.44 / 6.62 ±0.18 / 6.92 ms │              7.00 / 7.16 ±0.10 / 7.27 ms │   1.08x slower │
│ QQuery 37 │               5.26 / 5.34 ±0.05 / 5.41 ms │              9.03 / 9.15 ±0.14 / 9.39 ms │   1.72x slower │
│ QQuery 38 │         106.70 / 110.38 ±3.03 / 115.33 ms │          97.48 / 99.59 ±1.41 / 101.69 ms │  +1.11x faster │
│ QQuery 39 │         125.88 / 128.63 ±2.85 / 133.32 ms │        125.94 / 134.01 ±6.29 / 142.84 ms │      no change │
│ QQuery 40 │         130.53 / 134.98 ±5.25 / 144.80 ms │        106.72 / 109.33 ±2.13 / 112.48 ms │  +1.23x faster │
│ QQuery 41 │            15.43 / 15.53 ±0.08 / 15.66 ms │           15.91 / 16.15 ±0.17 / 16.43 ms │      no change │
│ QQuery 42 │         105.32 / 108.74 ±5.71 / 120.11 ms │        122.03 / 122.89 ±0.83 / 124.36 ms │   1.13x slower │
│ QQuery 43 │               5.99 / 6.11 ±0.09 / 6.22 ms │              5.33 / 5.46 ±0.19 / 5.84 ms │  +1.12x faster │
│ QQuery 44 │            11.63 / 11.85 ±0.19 / 12.14 ms │           10.48 / 10.57 ±0.15 / 10.87 ms │  +1.12x faster │
│ QQuery 45 │            42.41 / 43.29 ±0.74 / 44.57 ms │           71.20 / 73.71 ±2.43 / 78.32 ms │   1.70x slower │
│ QQuery 46 │               9.08 / 9.25 ±0.15 / 9.52 ms │              8.93 / 9.18 ±0.17 / 9.42 ms │      no change │
│ QQuery 47 │        734.49 / 783.36 ±42.32 / 860.92 ms │       669.46 / 699.97 ±22.73 / 725.69 ms │  +1.12x faster │
│ QQuery 48 │         441.64 / 456.34 ±9.60 / 471.80 ms │        324.58 / 335.64 ±8.94 / 347.76 ms │  +1.36x faster │
│ QQuery 49 │         261.18 / 263.38 ±1.70 / 265.49 ms │        246.77 / 250.69 ±2.76 / 255.18 ms │      no change │
│ QQuery 50 │        488.27 / 508.12 ±16.68 / 534.38 ms │        468.26 / 478.86 ±7.15 / 488.19 ms │  +1.06x faster │
│ QQuery 51 │        198.40 / 208.04 ±11.07 / 228.86 ms │        186.26 / 191.52 ±3.68 / 196.43 ms │  +1.09x faster │
│ QQuery 52 │         102.93 / 104.50 ±1.19 / 106.27 ms │        122.45 / 126.14 ±3.63 / 132.93 ms │   1.21x slower │
│ QQuery 53 │         130.27 / 133.64 ±4.03 / 141.53 ms │        128.55 / 130.54 ±1.82 / 133.82 ms │      no change │
│ QQuery 54 │         117.36 / 122.05 ±6.58 / 134.89 ms │        155.04 / 158.43 ±1.87 / 160.44 ms │   1.30x slower │
│ QQuery 55 │         101.43 / 104.23 ±2.41 / 107.89 ms │        120.07 / 121.31 ±0.87 / 122.33 ms │   1.16x slower │
│ QQuery 56 │         123.26 / 125.95 ±2.01 / 129.40 ms │        144.16 / 145.64 ±1.22 / 147.86 ms │   1.16x slower │
│ QQuery 57 │         175.59 / 179.70 ±3.86 / 185.12 ms │        175.42 / 178.72 ±3.04 / 183.39 ms │      no change │
│ QQuery 58 │         210.48 / 214.33 ±2.88 / 218.60 ms │        298.91 / 301.58 ±2.85 / 307.01 ms │   1.41x slower │
│ QQuery 59 │         254.19 / 260.90 ±5.36 / 267.64 ms │        223.62 / 226.43 ±2.32 / 229.22 ms │  +1.15x faster │
│ QQuery 60 │         126.70 / 129.68 ±3.59 / 136.56 ms │        142.88 / 144.76 ±1.35 / 146.43 ms │   1.12x slower │
│ QQuery 61 │            13.37 / 13.48 ±0.11 / 13.68 ms │           12.46 / 12.63 ±0.17 / 12.96 ms │  +1.07x faster │
│ QQuery 62 │        867.04 / 881.21 ±12.39 / 901.82 ms │       882.33 / 895.74 ±16.01 / 925.66 ms │      no change │
│ QQuery 63 │         129.02 / 134.44 ±3.77 / 138.84 ms │        126.99 / 128.82 ±1.71 / 131.95 ms │      no change │
│ QQuery 64 │ 28780.11 / 29343.27 ±712.24 / 30720.08 ms │        961.13 / 970.77 ±8.79 / 986.68 ms │ +30.23x faster │
│ QQuery 65 │        328.17 / 341.74 ±11.99 / 357.00 ms │        280.81 / 289.60 ±8.50 / 304.17 ms │  +1.18x faster │
│ QQuery 66 │         175.58 / 181.37 ±5.30 / 190.43 ms │        153.67 / 158.94 ±3.89 / 163.35 ms │  +1.14x faster │
│ QQuery 67 │        465.25 / 495.82 ±25.22 / 524.31 ms │       309.04 / 323.63 ±12.46 / 340.27 ms │  +1.53x faster │
│ QQuery 68 │               9.20 / 9.31 ±0.11 / 9.52 ms │              8.62 / 8.88 ±0.28 / 9.42 ms │      no change │
│ QQuery 69 │         153.81 / 159.08 ±3.90 / 164.51 ms │        110.45 / 114.03 ±3.55 / 120.19 ms │  +1.40x faster │
│ QQuery 70 │         403.66 / 409.56 ±4.68 / 417.16 ms │        355.49 / 367.02 ±6.55 / 375.38 ms │  +1.12x faster │
│ QQuery 71 │         122.28 / 126.72 ±5.19 / 136.45 ms │        148.51 / 151.31 ±2.01 / 153.79 ms │   1.19x slower │
│ QQuery 72 │     1175.65 / 1192.35 ±11.44 / 1207.07 ms │     1029.62 / 1040.35 ±7.58 / 1050.82 ms │  +1.15x faster │
│ QQuery 73 │               7.30 / 7.47 ±0.13 / 7.64 ms │              7.26 / 7.41 ±0.08 / 7.51 ms │      no change │
│ QQuery 74 │         479.35 / 488.79 ±6.83 / 497.40 ms │       505.10 / 536.48 ±16.72 / 552.72 ms │   1.10x slower │
│ QQuery 75 │         258.02 / 266.36 ±4.69 / 272.17 ms │        276.84 / 278.77 ±1.83 / 282.07 ms │      no change │
│ QQuery 76 │         264.14 / 271.20 ±4.06 / 275.19 ms │        140.94 / 144.79 ±2.03 / 146.96 ms │  +1.87x faster │
│ QQuery 77 │         210.96 / 214.80 ±3.80 / 222.00 ms │        202.78 / 206.97 ±4.26 / 214.95 ms │      no change │
│ QQuery 78 │         283.50 / 292.59 ±5.34 / 298.35 ms │        287.49 / 298.14 ±6.10 / 304.41 ms │      no change │
│ QQuery 79 │         239.88 / 247.46 ±6.04 / 253.49 ms │        262.65 / 270.15 ±9.21 / 288.09 ms │   1.09x slower │
│ QQuery 80 │         259.27 / 265.86 ±4.05 / 271.66 ms │        280.93 / 282.64 ±0.93 / 283.63 ms │   1.06x slower │
│ QQuery 81 │            29.99 / 30.47 ±0.37 / 30.92 ms │           29.73 / 30.39 ±0.47 / 30.92 ms │      no change │
│ QQuery 82 │            42.86 / 45.04 ±1.63 / 47.29 ms │           40.26 / 41.32 ±0.65 / 42.06 ms │  +1.09x faster │
│ QQuery 83 │            43.37 / 45.16 ±1.06 / 46.28 ms │           43.03 / 44.19 ±1.40 / 46.94 ms │      no change │
│ QQuery 84 │            61.25 / 63.00 ±1.68 / 65.26 ms │           62.36 / 62.69 ±0.36 / 63.35 ms │      no change │
│ QQuery 85 │         248.63 / 257.48 ±7.55 / 269.15 ms │        252.92 / 255.07 ±2.53 / 259.87 ms │      no change │
│ QQuery 86 │            46.03 / 47.63 ±1.07 / 49.10 ms │           39.31 / 39.65 ±0.32 / 40.09 ms │  +1.20x faster │
│ QQuery 87 │               3.58 / 3.72 ±0.19 / 4.08 ms │              3.50 / 3.60 ±0.13 / 3.86 ms │      no change │
│ QQuery 88 │         115.90 / 125.95 ±6.46 / 133.11 ms │        115.32 / 117.91 ±2.22 / 120.81 ms │  +1.07x faster │
│ QQuery 89 │         146.22 / 157.82 ±6.46 / 165.30 ms │        140.12 / 142.07 ±3.12 / 148.19 ms │  +1.11x faster │
│ QQuery 90 │            22.98 / 24.86 ±1.16 / 26.02 ms │           24.24 / 24.65 ±0.35 / 25.13 ms │      no change │
│ QQuery 91 │           90.26 / 94.48 ±3.52 / 100.67 ms │           85.89 / 87.41 ±1.27 / 89.16 ms │  +1.08x faster │
│ QQuery 92 │            50.58 / 53.48 ±3.14 / 59.51 ms │           57.20 / 58.07 ±0.62 / 58.79 ms │   1.09x slower │
│ QQuery 93 │         180.62 / 183.99 ±4.42 / 192.71 ms │        171.53 / 175.68 ±4.49 / 183.83 ms │      no change │
│ QQuery 94 │            65.21 / 66.25 ±0.61 / 67.02 ms │           65.27 / 67.43 ±1.64 / 69.42 ms │      no change │
│ QQuery 95 │         141.94 / 144.74 ±1.63 / 146.37 ms │        152.43 / 154.01 ±1.51 / 156.75 ms │   1.06x slower │
│ QQuery 96 │            76.33 / 78.73 ±2.30 / 81.84 ms │           80.96 / 82.21 ±1.12 / 83.65 ms │      no change │
│ QQuery 97 │         139.05 / 144.44 ±3.23 / 148.02 ms │        124.79 / 126.58 ±1.50 / 128.88 ms │  +1.14x faster │
│ QQuery 98 │         114.73 / 116.81 ±2.10 / 120.50 ms │        120.53 / 122.86 ±1.39 / 124.22 ms │   1.05x slower │
│ QQuery 99 │  10826.29 / 10849.43 ±19.10 / 10881.08 ms │ 10832.95 / 10894.25 ±44.76 / 10960.67 ms │      no change │
└───────────┴───────────────────────────────────────────┴──────────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 61002.03ms │
│ Total Time (adaptive-filters-in-decoder)   │ 33808.00ms │
│ Average Time (HEAD)                        │   616.18ms │
│ Average Time (adaptive-filters-in-decoder) │   341.49ms │
│ Queries Faster                             │         35 │
│ Queries Slower                             │         31 │
│ Queries with No Change                     │         33 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
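
A rough sketch of how the Change column above appears to be derived. This is an assumption inferred from the rows themselves, not taken from the benchmark runner's source: the label looks like the ratio of the two mean times (the middle value in each cell), with differences under about 5% reported as `no change`. The names `change_label` and `NOISE_BAND` are hypothetical.

```rust
/// Relative band inside which a difference is reported as noise.
/// ASSUMPTION: 5%, inferred from the table rows, not from the runner's code.
const NOISE_BAND: f64 = 0.05;

/// Hypothetical helper: build the "Change" cell from two mean times (ms).
fn change_label(base_mean_ms: f64, branch_mean_ms: f64) -> String {
    // Ratio above 1.0 means the branch took longer than the baseline.
    let ratio = branch_mean_ms / base_mean_ms;
    if (ratio - 1.0).abs() < NOISE_BAND {
        "no change".to_string()
    } else if ratio < 1.0 {
        // Report speedups as baseline / branch, with a leading '+'.
        format!("+{:.2}x faster", base_mean_ms / branch_mean_ms)
    } else {
        format!("{:.2}x slower", ratio)
    }
}
```

Under these assumptions, `change_label(184.00, 1353.92)` reproduces the `7.36x slower` shown for query 24 and `change_label(29343.27, 970.77)` reproduces the `+30.23x faster` shown for query 64, while query 14 (1006.64 ms vs 958.42 ms) stays inside the band and reads `no change`.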

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 310.1s |
| Peak memory | 18.1 GiB |
| Avg memory | 9.9 GiB |
| CPU user | 600.6s |
| CPU sys | 13.1s |
| Peak spill | 0 B |

tpcds — branch

| Metric | Value |
|---|---|
| Wall time | 170.0s |
| Peak memory | 6.2 GiB |
| Avg memory | 5.6 GiB |
| CPU user | 226.1s |
| CPU sys | 6.7s |
| Peak spill | 0 B |


Adds a `LimitedBatchCoalescer` to `AdaptiveParquetStream`'s post-scan
filter path, mirroring `FilterExec`'s behavior. Without it, inline
post-scan filtering hands tiny batches (1-100 rows each on selective
predicates) directly to TopK, which delays tightening of the dynamic
filter: TopK improves its threshold only one small batch at a time.
`FilterExec`'s coalescer instead ensures that the first batch reaching
TopK already contains thousands of survivors, so TopK can pick a
near-optimal threshold in one shot.

Symptom this fixes: on `Q26` (`SELECT SearchPhrase FROM hits WHERE
SearchPhrase <> '' ORDER BY EventTime LIMIT 10`) at 12 partitions,
the branch matches 33-34 file ranges versus 28 on main with
pushdown=false. With the coalescer, the branch matches 30-32, closing
~1/3 of the gap. The remaining difference of ~2 prunings is
unexplained but small.

Coalescer params match `FilterExec`: target_batch_size from session,
biggest_coalesce_batch_size = target/2 (set inside
`LimitedBatchCoalescer::new`).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
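
To make the batching argument in the commit message concrete, here is a minimal sketch of what a coalescer does to a stream of tiny filtered batches. It is a simplified stand-in, not the actual `LimitedBatchCoalescer`: `Batch`, `Coalescer`, and `coalesced_sizes` are hypothetical names, rows are plain integers rather than Arrow arrays, and the limit handling and large-batch bypass of the real type are left out.

```rust
/// Hypothetical stand-in for a record batch: one value per row.
type Batch = Vec<i64>;

/// Simplified coalescer: buffers small batches and releases them only
/// once at least `target_rows` rows have accumulated.
struct Coalescer {
    target_rows: usize,
    buffered: Batch,
}

impl Coalescer {
    fn new(target_rows: usize) -> Self {
        Self { target_rows, buffered: Vec::new() }
    }

    /// Accept one (possibly tiny) batch; return a combined batch when ready.
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        self.buffered.extend(batch);
        if self.buffered.len() >= self.target_rows {
            Some(std::mem::take(&mut self.buffered))
        } else {
            None
        }
    }

    /// Flush whatever is still buffered at end of stream.
    fn finish(&mut self) -> Option<Batch> {
        if self.buffered.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffered))
        }
    }
}

/// Feed every input batch through the coalescer and report the row count
/// of each batch a downstream operator (such as TopK) would receive.
fn coalesced_sizes(inputs: Vec<Batch>, target_rows: usize) -> Vec<usize> {
    let mut coalescer = Coalescer::new(target_rows);
    let mut sizes = Vec::new();
    for batch in inputs {
        if let Some(out) = coalescer.push(batch) {
            sizes.push(out.len());
        }
    }
    if let Some(out) = coalescer.finish() {
        sizes.push(out.len());
    }
    sizes
}
```

With a target of 8 rows, input batches of 3, 4, 5 and 2 rows reach the consumer as one 12-row batch followed by a 2-row remainder, so the consumer sees most survivors in its first batch instead of four small ones.
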
@adriangb
Owner Author

run benchmark clickbench_partitioned

baseline:
    ref: main
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: false
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: false
changed:
    ref: HEAD
    env:
       DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: true
       DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: true

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4345518850-1914-gnp9n 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux


Comparing HEAD (d146ebe) to main diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃           adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.21 / 4.67 ±6.83 / 18.32 ms │          1.21 / 4.63 ±6.76 / 18.15 ms │      no change │
│ QQuery 1  │        12.34 / 12.88 ±0.27 / 13.10 ms │        13.17 / 13.56 ±0.22 / 13.82 ms │   1.05x slower │
│ QQuery 2  │        36.97 / 37.50 ±0.60 / 38.34 ms │        36.42 / 36.72 ±0.23 / 37.00 ms │      no change │
│ QQuery 3  │        31.41 / 31.99 ±0.60 / 32.90 ms │        31.01 / 31.20 ±0.14 / 31.42 ms │      no change │
│ QQuery 4  │     243.50 / 246.62 ±4.14 / 254.58 ms │     246.23 / 249.93 ±2.06 / 252.18 ms │      no change │
│ QQuery 5  │     285.40 / 286.90 ±1.13 / 288.87 ms │     281.54 / 284.57 ±3.17 / 290.45 ms │      no change │
│ QQuery 6  │           6.38 / 6.70 ±0.28 / 7.06 ms │           6.13 / 6.52 ±0.49 / 7.47 ms │      no change │
│ QQuery 7  │        13.72 / 14.00 ±0.34 / 14.62 ms │        14.21 / 15.27 ±1.80 / 18.86 ms │   1.09x slower │
│ QQuery 8  │     326.21 / 332.37 ±4.01 / 336.34 ms │     321.17 / 323.11 ±1.86 / 326.32 ms │      no change │
│ QQuery 9  │     452.72 / 455.34 ±2.85 / 460.59 ms │     448.33 / 450.81 ±2.85 / 455.76 ms │      no change │
│ QQuery 10 │        75.45 / 76.21 ±0.88 / 77.87 ms │        68.80 / 70.80 ±1.31 / 72.00 ms │  +1.08x faster │
│ QQuery 11 │        85.97 / 86.48 ±0.57 / 87.59 ms │        79.73 / 80.40 ±0.75 / 81.82 ms │  +1.08x faster │
│ QQuery 12 │     277.12 / 280.65 ±3.04 / 285.00 ms │     273.14 / 276.14 ±3.15 / 281.82 ms │      no change │
│ QQuery 13 │     395.24 / 400.89 ±3.50 / 405.99 ms │     388.44 / 396.35 ±5.78 / 405.18 ms │      no change │
│ QQuery 14 │     288.66 / 293.17 ±2.61 / 296.34 ms │     278.94 / 282.63 ±2.04 / 284.64 ms │      no change │
│ QQuery 15 │     283.05 / 290.28 ±5.99 / 300.93 ms │    281.81 / 293.01 ±11.73 / 314.86 ms │      no change │
│ QQuery 16 │     626.43 / 634.16 ±4.55 / 639.99 ms │     608.85 / 614.49 ±5.50 / 622.22 ms │      no change │
│ QQuery 17 │     635.18 / 644.25 ±6.78 / 653.17 ms │     611.71 / 618.21 ±4.57 / 625.75 ms │      no change │
│ QQuery 18 │ 1272.67 / 1304.39 ±18.65 / 1325.33 ms │ 1222.25 / 1235.95 ±12.49 / 1258.34 ms │  +1.06x faster │
│ QQuery 19 │        29.60 / 32.24 ±2.45 / 36.03 ms │       28.04 / 38.60 ±20.48 / 79.56 ms │   1.20x slower │
│ QQuery 20 │     519.23 / 530.65 ±8.20 / 542.81 ms │     517.89 / 524.84 ±6.68 / 536.71 ms │      no change │
│ QQuery 21 │     596.21 / 601.40 ±3.64 / 606.21 ms │    631.36 / 643.73 ±14.06 / 661.58 ms │   1.07x slower │
│ QQuery 22 │  1059.95 / 1067.02 ±4.95 / 1075.04 ms │     898.11 / 905.34 ±5.29 / 913.22 ms │  +1.18x faster │
│ QQuery 23 │ 3340.58 / 3381.57 ±21.26 / 3400.07 ms │    248.87 / 269.90 ±12.05 / 281.70 ms │ +12.53x faster │
│ QQuery 24 │        42.68 / 42.97 ±0.17 / 43.11 ms │        32.44 / 33.49 ±1.33 / 36.04 ms │  +1.28x faster │
│ QQuery 25 │     115.04 / 117.67 ±1.93 / 120.70 ms │     116.08 / 122.98 ±5.92 / 131.10 ms │      no change │
│ QQuery 26 │        42.92 / 44.47 ±2.82 / 50.10 ms │        50.76 / 53.90 ±4.19 / 62.10 ms │   1.21x slower │
│ QQuery 27 │     674.98 / 680.97 ±5.16 / 690.57 ms │     670.43 / 674.24 ±3.15 / 678.99 ms │      no change │
│ QQuery 28 │ 3004.78 / 3031.78 ±23.89 / 3072.55 ms │ 2978.38 / 3011.90 ±18.50 / 3031.17 ms │      no change │
│ QQuery 29 │        42.46 / 49.93 ±6.63 / 60.24 ms │        41.98 / 50.29 ±6.84 / 57.75 ms │      no change │
│ QQuery 30 │     311.32 / 314.70 ±2.66 / 319.02 ms │     304.27 / 305.48 ±1.38 / 308.16 ms │      no change │
│ QQuery 31 │     310.78 / 317.75 ±4.43 / 323.69 ms │     300.38 / 306.96 ±7.12 / 316.38 ms │      no change │
│ QQuery 32 │  1024.20 / 1030.05 ±3.94 / 1034.84 ms │   969.91 / 981.12 ±12.14 / 1001.90 ms │      no change │
│ QQuery 33 │  1465.22 / 1477.84 ±7.50 / 1488.06 ms │ 1425.25 / 1441.36 ±10.61 / 1457.93 ms │      no change │
│ QQuery 34 │ 1444.84 / 1455.25 ±17.20 / 1489.56 ms │ 1433.53 / 1452.64 ±13.22 / 1473.43 ms │      no change │
│ QQuery 35 │    287.08 / 304.13 ±15.32 / 322.57 ms │    292.32 / 306.33 ±18.28 / 341.80 ms │      no change │
│ QQuery 36 │        60.64 / 64.08 ±2.58 / 67.53 ms │        61.92 / 68.09 ±7.40 / 81.61 ms │   1.06x slower │
│ QQuery 37 │        36.52 / 42.76 ±5.72 / 51.47 ms │        35.70 / 38.76 ±3.33 / 44.37 ms │  +1.10x faster │
│ QQuery 38 │        40.85 / 43.28 ±2.28 / 47.42 ms │        36.11 / 41.34 ±4.01 / 47.32 ms │      no change │
│ QQuery 39 │     126.89 / 132.01 ±5.42 / 141.98 ms │     115.28 / 124.54 ±7.03 / 135.90 ms │  +1.06x faster │
│ QQuery 40 │        14.38 / 16.66 ±3.11 / 22.83 ms │        18.49 / 20.25 ±2.91 / 26.03 ms │   1.22x slower │
│ QQuery 41 │        14.23 / 15.78 ±2.63 / 21.02 ms │        17.08 / 18.26 ±1.74 / 21.73 ms │   1.16x slower │
│ QQuery 42 │        13.21 / 13.80 ±0.85 / 15.45 ms │        16.40 / 17.65 ±1.72 / 20.99 ms │   1.28x slower │
└───────────┴───────────────────────────────────────┴───────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 20248.25ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16736.28ms │
│ Average Time (HEAD)                        │   470.89ms │
│ Average Time (adaptive-filters-in-decoder) │   389.22ms │
│ Queries Faster                             │          8 │
│ Queries Slower                             │          9 │
│ Queries with No Change                     │         26 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 30.7 GiB |
| Avg memory | 23.4 GiB |
| CPU user | 1075.2s |
| CPU sys | 62.3s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 85.0s |
| Peak memory | 30.8 GiB |
| Avg memory | 23.4 GiB |
| CPU user | 885.5s |
| CPU sys | 50.7s |
| Peak spill | 0 B |


@adriangb
Owner Author

run benchmark clickbench_partitioned

baseline:
    ref: main
    env:
        DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: false
        DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: false
changed:
    ref: 04e7aab88
    env:
        DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS: true
        DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS: true

Benchmark 04e7aab (no coalescer, no prune_rate gate) to isolate coalescer impact
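
The `env` entries in the request above appear to follow DataFusion's convention of deriving environment variable names from dotted config keys (upper-cased, with dots replaced by underscores). A minimal sketch of that mapping, assuming the convention holds for these two options; `config_key_to_env_var` is a hypothetical helper for illustration, not a DataFusion API:

```rust
/// Hypothetical helper: convert a dotted config key into the
/// environment-variable spelling used in the benchmark request.
/// Assumes the upper-case / dots-to-underscores convention.
fn config_key_to_env_var(key: &str) -> String {
    key.to_uppercase().replace('.', "_")
}

fn main() {
    // The two options toggled between the baseline and changed runs.
    for key in [
        "datafusion.execution.parquet.pushdown_filters",
        "datafusion.execution.parquet.reorder_filters",
    ] {
        println!("{} -> {}", key, config_key_to_env_var(key));
    }
}
```

Under that assumption, the baseline run disables filter pushdown and reordering while the changed run enables both, so the comparison covers the whole pushdown path rather than only the adaptive scheduling change.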

@adriangbot

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4345685099-1917-q8jlm 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing 04e7aab (04e7aab) to main diff using: clickbench_partitioned
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

Comparing HEAD and adaptive-filters-in-decoder
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃          adaptive-filters-in-decoder ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.19 / 4.56 ±6.67 / 17.90 ms │         1.21 / 4.56 ±6.65 / 17.85 ms │      no change │
│ QQuery 1  │        12.36 / 12.60 ±0.14 / 12.78 ms │       13.54 / 13.67 ±0.10 / 13.84 ms │   1.08x slower │
│ QQuery 2  │        36.22 / 36.51 ±0.28 / 36.88 ms │       35.81 / 36.06 ±0.26 / 36.56 ms │      no change │
│ QQuery 3  │        31.18 / 31.98 ±0.91 / 33.75 ms │       30.36 / 30.53 ±0.12 / 30.70 ms │      no change │
│ QQuery 4  │     238.35 / 240.26 ±1.90 / 243.63 ms │    238.76 / 241.93 ±2.18 / 245.33 ms │      no change │
│ QQuery 5  │     283.27 / 284.82 ±1.60 / 287.35 ms │    279.26 / 280.56 ±1.47 / 283.40 ms │      no change │
│ QQuery 6  │           6.57 / 7.10 ±0.61 / 8.12 ms │          5.13 / 6.45 ±1.07 / 7.90 ms │  +1.10x faster │
│ QQuery 7  │        13.46 / 14.27 ±1.33 / 16.93 ms │       14.33 / 14.72 ±0.33 / 15.29 ms │      no change │
│ QQuery 8  │     325.03 / 328.87 ±3.67 / 335.18 ms │    316.99 / 318.90 ±1.71 / 321.43 ms │      no change │
│ QQuery 9  │     442.96 / 450.74 ±6.57 / 460.81 ms │    443.19 / 455.25 ±9.23 / 470.04 ms │      no change │
│ QQuery 10 │        73.10 / 73.73 ±0.40 / 74.27 ms │       69.12 / 70.52 ±1.38 / 73.12 ms │      no change │
│ QQuery 11 │        84.20 / 84.91 ±0.86 / 86.51 ms │       79.31 / 80.62 ±1.19 / 82.74 ms │  +1.05x faster │
│ QQuery 12 │     274.88 / 279.07 ±2.65 / 281.85 ms │    263.78 / 269.22 ±5.08 / 277.40 ms │      no change │
│ QQuery 13 │    390.88 / 403.23 ±15.57 / 433.63 ms │    411.89 / 416.12 ±2.60 / 420.06 ms │      no change │
│ QQuery 14 │     284.35 / 288.11 ±3.76 / 294.82 ms │    269.14 / 274.04 ±4.71 / 281.36 ms │      no change │
│ QQuery 15 │     281.01 / 283.18 ±1.83 / 285.74 ms │   279.65 / 290.04 ±12.65 / 314.63 ms │      no change │
│ QQuery 16 │     611.98 / 620.64 ±5.51 / 626.38 ms │    597.28 / 604.09 ±4.76 / 610.89 ms │      no change │
│ QQuery 17 │     618.32 / 625.99 ±4.25 / 630.17 ms │    598.35 / 610.35 ±9.16 / 621.88 ms │      no change │
│ QQuery 18 │ 1243.03 / 1262.25 ±13.50 / 1281.28 ms │ 1199.79 / 1211.05 ±7.19 / 1220.79 ms │      no change │
│ QQuery 19 │        28.09 / 31.39 ±5.57 / 42.47 ms │       27.59 / 32.32 ±7.80 / 47.90 ms │      no change │
│ QQuery 20 │     516.28 / 521.16 ±7.33 / 535.71 ms │    513.98 / 517.34 ±3.20 / 523.31 ms │      no change │
│ QQuery 21 │     595.46 / 600.35 ±5.77 / 611.05 ms │    621.10 / 632.55 ±9.17 / 642.28 ms │   1.05x slower │
│ QQuery 22 │ 1060.75 / 1073.04 ±11.76 / 1089.08 ms │    876.99 / 888.66 ±6.51 / 897.00 ms │  +1.21x faster │
│ QQuery 23 │ 3317.10 / 3339.13 ±17.98 / 3371.73 ms │    164.59 / 168.66 ±4.29 / 175.88 ms │ +19.80x faster │
│ QQuery 24 │        41.56 / 44.54 ±5.49 / 55.51 ms │       31.18 / 33.95 ±5.12 / 44.18 ms │  +1.31x faster │
│ QQuery 25 │     112.34 / 114.21 ±1.20 / 115.86 ms │    115.74 / 116.50 ±0.86 / 118.17 ms │      no change │
│ QQuery 26 │        41.98 / 43.30 ±1.24 / 45.58 ms │       49.66 / 53.36 ±6.17 / 65.66 ms │   1.23x slower │
│ QQuery 27 │     667.97 / 674.76 ±8.38 / 690.89 ms │   640.93 / 649.32 ±12.17 / 673.44 ms │      no change │
│ QQuery 28 │ 2996.61 / 3012.90 ±13.74 / 3034.41 ms │ 2971.89 / 2985.06 ±8.05 / 2995.50 ms │      no change │
│ QQuery 29 │        41.95 / 47.68 ±6.63 / 57.01 ms │       41.51 / 47.45 ±6.74 / 57.87 ms │      no change │
│ QQuery 30 │     305.77 / 310.97 ±4.96 / 317.56 ms │    306.55 / 311.61 ±5.80 / 322.47 ms │      no change │
│ QQuery 31 │     304.15 / 310.11 ±5.40 / 318.38 ms │    362.30 / 369.74 ±5.78 / 379.41 ms │   1.19x slower │
│ QQuery 32 │   995.67 / 1003.35 ±7.06 / 1016.44 ms │   950.86 / 959.98 ±10.54 / 978.66 ms │      no change │
│ QQuery 33 │  1416.07 / 1426.74 ±7.94 / 1436.94 ms │ 1411.33 / 1421.28 ±7.05 / 1430.89 ms │      no change │
│ QQuery 34 │ 1425.85 / 1437.41 ±10.01 / 1452.92 ms │ 1424.21 / 1431.52 ±6.42 / 1439.22 ms │      no change │
│ QQuery 35 │    285.33 / 306.58 ±33.51 / 373.13 ms │   284.87 / 300.20 ±16.65 / 325.11 ms │      no change │
│ QQuery 36 │        64.45 / 70.94 ±4.94 / 77.02 ms │       60.92 / 68.50 ±9.03 / 84.28 ms │      no change │
│ QQuery 37 │        34.59 / 35.63 ±0.70 / 36.28 ms │       34.82 / 36.99 ±2.14 / 40.73 ms │      no change │
│ QQuery 38 │        40.54 / 44.95 ±5.48 / 55.70 ms │       36.29 / 41.85 ±7.46 / 56.54 ms │  +1.07x faster │
│ QQuery 39 │     129.45 / 135.19 ±5.16 / 144.67 ms │    117.58 / 125.56 ±4.53 / 131.15 ms │  +1.08x faster │
│ QQuery 40 │        14.05 / 14.22 ±0.24 / 14.68 ms │       17.74 / 21.80 ±5.94 / 33.55 ms │   1.53x slower │
│ QQuery 41 │        13.47 / 13.70 ±0.13 / 13.83 ms │       16.53 / 16.81 ±0.29 / 17.36 ms │   1.23x slower │
│ QQuery 42 │        12.93 / 13.26 ±0.24 / 13.63 ms │       15.98 / 18.33 ±3.97 / 26.26 ms │   1.38x slower │
└───────────┴───────────────────────────────────────┴──────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 19958.35ms │
│ Total Time (adaptive-filters-in-decoder)   │ 16478.03ms │
│ Average Time (HEAD)                        │   464.15ms │
│ Average Time (adaptive-filters-in-decoder) │   383.21ms │
│ Queries Faster                             │          7 │
│ Queries Slower                             │          7 │
│ Queries with No Change                     │         29 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
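
For reading the table, the Change column is consistent with a ±5% noise band on the ratio of mean times. The sketch below reproduces that labelling and separates the contribution of QQuery 23 from the aggregate, using only numbers from the table and summary above. The 5% threshold and the function names are assumptions inferred from the rows, not taken from the benchmark runner's source.

```rust
/// Ratio of baseline to candidate time; above 1.0 means the candidate is faster.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

/// Label a pair of mean times in the style of the "Change" column.
/// Assumes a symmetric 5% noise band on candidate / baseline.
fn classify(baseline_ms: f64, candidate_ms: f64) -> String {
    const NOISE: f64 = 0.05;
    let change = candidate_ms / baseline_ms;
    if change >= 1.0 - NOISE && change <= 1.0 + NOISE {
        "no change".to_string()
    } else if change < 1.0 {
        format!("+{:.2}x faster", 1.0 / change)
    } else {
        format!("{:.2}x slower", change)
    }
}

fn main() {
    // Mean times (ms) copied from the table above: (query, HEAD, branch).
    let rows = [
        ("QQuery 14", 288.11, 274.04),
        ("QQuery 21", 600.35, 632.55),
        ("QQuery 23", 3339.13, 168.66),
    ];
    for (name, head, branch) in rows {
        println!("{}: {}", name, classify(head, branch));
    }

    // Totals (ms) copied from the Benchmark Summary above.
    let (total_head, total_branch) = (19958.35, 16478.03);
    let (q23_head, q23_branch) = (3339.13, 168.66);
    println!("overall: {:.2}x", speedup(total_head, total_branch));
    println!(
        "excluding QQuery 23: {:.2}x",
        speedup(total_head - q23_head, total_branch - q23_branch)
    );
}
```

With these inputs the aggregate speedup is about 1.21x, and about 1.02x once QQuery 23 is excluded from both totals, so most of the total-time win in this run comes from that one query.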

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 105.0s |
| Peak memory | 30.1 GiB |
| Avg memory | 23.2 GiB |
| CPU user | 1058.2s |
| CPU sys | 61.6s |
| Peak spill | 0 B |

clickbench_partitioned — branch

| Metric | Value |
|---|---|
| Wall time | 85.0s |
| Peak memory | 30.2 GiB |
| Avg memory | 23.0 GiB |
| CPU user | 872.7s |
| CPU sys | 49.1s |
| Peak spill | 0 B |

